GLOBAL FLUCTUATIONS FOR LINEAR STATISTICS OF $\beta$-JACOBI ENSEMBLES

Abstract. We study the global fluctuations for linear statistics of the form $\sum_{i=1}^n f(\lambda_i)$ as $n \to \infty$, for $C^1$
functions $f$, and $\lambda_1, \ldots, \lambda_n$ being the eigenvalues of a (general) $\beta$-Jacobi ensemble [18, 29]. The fluctuation
from the mean ($\sum_{i=1}^n f(\lambda_i) - \mathbb{E}\sum_{i=1}^n f(\lambda_i)$) is given asymptotically by a Gaussian process.
We compute the covariance matrix for the process and show that it is diagonalized by a shifted Chebyshev
polynomial basis; in addition, we analyze the deviation from the predicted mean for polynomial test
functions, and we obtain a law of large numbers.

1. Introduction 



Global fluctuations for linear statistics, also known as central limit theorems, have been of interest to the 
random matrix community for almost as long as the limiting properties of empirical spectral distributions 
(also known sometimes as laws of large numbers). A variety of models and eigenvalue distributions have 
been studied from this point of view, starting with the classical Gaussian and Wishart matrices [34, 36],
generalizations thereof (Wigner and Wishart-like matrices) [3 1 [TT 1 I2T 1 I3 ^ |3T 1 H2 1 H5 ] . tridiagonal models [171155] . 
different eigenvalue potentials [23], $\beta$-ensembles [THlllH], classical compact groups [T3JI2I], banded matrices
[21121] , permutations [7J and so on. The methods of approach range from the classical method of moments 
[2J[T7], to free probability [12l[2Tll30l l35] and stochastic calculus [TT] . 

To put it more concretely, we are interested in the following problem. A linear statistic of an $n \times n$ matrix
$A$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ is a functional of the form

$\mathcal{F}(A) := \sum_{i=1}^n f(\lambda_i),$

where $f$ is a function (we sometimes refer to them as test functions) belonging to a certain class (which,
depending on the ensemble to which $A$ belongs, may be as restrictive as the class of polynomials, or as wide
as $L^2$). The first issue at hand is to calculate the limit of $\frac{1}{n}\mathcal{F}(A)$ as $n \to \infty$ (in case this exists), in other
words, to find the limiting empirical spectral distribution for the eigenvalues of $A$ (also known as the law of
large numbers). The second issue is to examine the fluctuation from the mean, e.g., study

$X_{f,A} := \mathcal{F}(A) - \mathbb{E}\,\mathcal{F}(A),$

perhaps under a suitable scaling, and prove that $X_{f,A}$ converges in distribution, here to a centered Gaussian
variable whose variance depends on $f$. The term "global" in "global fluctuations" refers to the fact that all
eigenvalues contribute similarly to $\mathcal{F}(A)$.

The Jacobi ensemble (also known as Double Wishart) is one of many on which such studies have been 
performed. They were introduced in connection with the MANOVA procedure of statistics for measuring 
the likelihood of a multivariate linear model [34, 36], and found to be of interest in quantum conductance and
log-gas theory [B][2D]. One can describe them through their eigenvalue distributions 

(1) $d\mu(\lambda_1, \ldots, \lambda_n) = \frac{1}{Z} \prod_{i} \lambda_i^{\frac{\beta}{2}(n_1 - n + 1) - 1} (1 - \lambda_i)^{\frac{\beta}{2}(n_2 - n + 1) - 1} \prod_{i<j} |\lambda_i - \lambda_j|^{\beta} \, d\lambda_1 \cdots d\lambda_n,$

where $Z = Z(n, n_1, n_2, \beta)$ is a normalization constant. In full generality, $\beta > 0$, while $n_1$ and $n_2$ need not
be positive integers; in fact, the only constraints (which relate to the integrability of the measure) are that
$n_1, n_2 > n - 1$.

In the case that $\beta \in \{1, 2, 4\}$ and $n_1, n_2 \in \mathbb{N}$, they admit full matrix models (as $J = W_1^{1/2}(W_1 +
W_2)^{-1} W_1^{1/2}$, where $W_1, W_2$ are Wishart matrices, hence the "double Wishart" name). For an extensive study
of the $\beta = 1$ case, as well as a clear exposition of how these models arose, we refer to [33]; the other cases
($\beta = 2, 4$) can be dealt with similarly.

Recently, it was shown that in these "classical" cases a different kind of model can be constructed, starting
from random projections, rather than random Wishart matrices; or, equivalently, that "chopping off" an
appropriate corner of a unitary Haar matrix will yield a matrix whose singular values, squared, are distributed
according to (1) (discovered in [14], rediscovered in [18]).

The greatest generality is achieved by the tridiagonal model [18, 29], which covers any $\beta > 0$, and removes
the condition that $n_1, n_2 \in \mathbb{N}$. We give the model below (hereafter referred to as the Edelman-Sutton model,
as it appears most clearly in their work [18]). Given the matrix $B_\beta$ defined as

(2) $B_\beta = \begin{pmatrix} c_n s'_{n-1} & & & & \\ -s_{n-1} c'_{n-1} & c_{n-1} s'_{n-2} & & & \\ & -s_{n-2} c'_{n-2} & c_{n-2} s'_{n-3} & & \\ & & \ddots & \ddots & \\ & & & -s_1 c'_1 & c_1 \end{pmatrix},$

with the variables $c_i, s_i$, $i = 1, \ldots, n$, and $c'_j, s'_j$, $j = 1, \ldots, (n - 1)$ obeying the distribution laws and
relationships

(3) $\{c_1, c_2, \ldots, c_n, c'_1, c'_2, \ldots, c'_{n-1}\}$ mutually independent,

(4) $c_i \sim \sqrt{\mathrm{Beta}(\frac{\beta}{2}(n_1 - n + i), \frac{\beta}{2}(n_2 - n + i))}$ and $c'_j \sim \sqrt{\mathrm{Beta}(\frac{\beta}{2} j, \frac{\beta}{2}(n_1 + n_2 - 2n + 1 + j))}$,

(5) $s_i = \sqrt{1 - c_i^2}$ and $s'_j = \sqrt{1 - (c'_j)^2}$,

the eigenvalues of $A = B_\beta B_\beta^T$ are distributed according to (1) (see [18]).

We are interested in the behavior of $X_{f,A}$ as $n \to \infty$ with $(n_1 + n_2 - 2n)$ growing linearly in $n$ and with
$\beta$ fixed. This is the only scaling regime in which the limiting spectral distribution is truly Jacobi.

If either $n_1 \gg n$ or $n_2 \gg n$, in the case when $\beta = 1, 2, 4$, the Wishart matrices in the full models have
$W_1 \approx \beta n_1 I_n$, respectively, $W_2 \approx \beta n_2 I_n$. For example if $n_2 \gg n$ and $n_2 \gg n_1$, this heuristic predicts that
the Double Wishart model behaves like

$W_1(W_1 + W_2)^{-1} \approx W_1(W_1 + \beta n_2 \mathrm{Id})^{-1} \approx W_1/(\beta n_2),$

so that appropriately rescaling, Wishart behavior should appear. These heuristics are studied rigorously in
Jiang [23]. (The symmetric regime, $n_1 \gg n$ and $n_1 \gg n_2$, predicts Wishart behavior with a huge shift in
eigenvalues.)

Conversely, in the sublinear growth cases, i.e. where $(n_1 + n_2 - 2n) \ll n$, the Jacobi ensemble takes on
behavior that looks much more like the classical compact groups. This connection is explicit for $\beta = 1, 4$
and fixed values of $n_1 - n$ and $n_2 - n$ (see Proposition 3.1 of [24]). These heuristics predict the correct
limiting spectral distributions as well. In the superlinear case, the limiting spectral distribution is a point
mass (easily seen also from [3] which shows that the matrix $B_\beta B_\beta^T$ is very close to a multiple of the identity),
while in the sublinear case, the limiting spectral distribution is the arcsine law. These statements about
the limiting spectral distributions are straightforward exercises following the approach of Trotter [49]. We
sketch this approach in the proof of the following proposition.

Proposition 1.1. Let $f$ be a continuous test function on [0, 1].
(1) If $n_1 + n_2 - 2n = o(n)$, then

$\frac{1}{n} \sum_{i=1}^n f(\lambda_i) \to \frac{1}{\pi} \int_0^1 \frac{f(x)}{\sqrt{x(1 - x)}} \, dx.$

(2) If $n_1/n \to p$ and $n_2/n \to q$, then

$\frac{1}{n} \sum_{i=1}^n f(\lambda_i) \to \int f(x) \, d\mu(x),$

where $\mu$ has density
(3) If $n_1 + n_2 - 2n = \omega(n)$ and if (n_1 - n)/(n_1

Proof. Regardless of the scales of $n_1 - n$ and $n_2 - n$, the limiting eigenvalue distribution can be understood
by computing $A_\infty = B_\infty B_\infty^T$. (Note that on taking the $\beta$ parameter to infinity, the $\mathrm{Beta}(\beta x, \beta y)$ variable in
the matrix model converges in probability to $\frac{x}{x+y}$. Replacing the Beta variables by these limits in $B_\beta$ gives
the matrix $B_\infty$.)

By applying Stirling's approximation, it can be shown that there is a constant $C$ depending only on $\beta$ so
that



Ci 



m 



m + n 2 



< 



C 



2n + 2i 
A similar bound holds for $c'_i$ and for $c_i s_i$. Applying all these bounds, it follows that

(6) $\mathbb{E}\,\|B_\beta - B_\infty\|_F^2 = O(\log n).$

From the fundamental realization of Trotter [49], any $o(n)$ bound on the expected-square Frobenius norm
suffices to show that the ESDs of two matrix models are converging together as $n \to \infty$.

It is now elementary to check that the limiting spectral distribution for $B_\infty B_\infty^T$ is that which is stated in
the proposition in the sublinear and superlinear cases. In the linear case, we compute the limiting distribution
by way of the Jacobi differential recurrence formula, which we do in proving Theorem 5.1 (see (45)). □

In our study of the linear scaling regime, we apply a wide array of methods, starting with the method
of moments (which often boils down to path-counting), special functions (orthogonal polynomial) theory
and generating functions, as well as one important result from the work of Anderson and Zeitouni [2] (more
details in Section 4).

As mentioned in the introductory paragraph, the study of global fluctuations of linear statistics for random 
matrices spans a wide literature, and covers a broad spectrum of models. We will only mention here a few 
works that are either closely related in scope, in model, or those that have served as inspiration for our study. 

The method of moments, introduced by Wigner himself [51, 52] and used for proving central limit theorems
for polynomials of Wishart matrices by Jonsson [26], has been employed with great success by Sinai and
Soshnikov [42], Soshnikov [43], Péché and Soshnikov [38], etc., to obtain both central limit theorems for traces
of large powers of random matrices and universality results for the fluctuations of the extremal eigenvalues
in the case of Wigner and Wishart matrices. The method of moments has also been used by Dumitriu and
Edelman [17] to calculate the fluctuations in the case of $\beta$-Hermite and $\beta$-Laguerre ensembles (generalizations
of the Gaussian and central Wishart ensembles for $\beta = 1, 2, 4$), in the case of polynomial functions. It is also
one essential ingredient in the work of Anderson-Zeitouni [2] on band matrices.

It is worth mentioning that the method of moments is essentially equivalent in spirit (though not necessar- 
ily in form) to the Stieltjes transform methods used by Bai and Silverstein (e.g., [3]) to calculate central limit 
theorems for generalized Wishart matrices; for a good reference on the methodology involved, we recommend 

Another method for computing fluctuations of linear statistics involves a stochastic calculus approach
introduced by Cabanal-Duvillard [11] to prove a central limit theorem for Wishart matrices in the case
$\beta = 2$; stochastic calculus was also used by Guionnet [21] in computing fluctuations for a class of band
matrices and sample covariance matrices, and by Guionnet and Zeitouni [22] to calculate large deviations
for a wide class of random matrices.

Other approaches to calculating fluctuations for linear functionals for $\beta$-ensembles include the Capitaine
and Casalis work [12], which, through free probability, obtains results for both Wishart and Jacobi (Double
Wishart) matrices in the case $\beta = 2$. The later work of Kusalik, Mingo, and Speicher [30] builds on [12] and
on results obtained by Mingo and Nica [35] to obtain fluctuations (second-order asymptotics) for random
matrices (also in the case $\beta = 2$). Finally, Chatterjee [12] has introduced the Stein method to computing
central limit theorems for a wide class of random matrices, for analytic potentials.

Specifically in the case of $\beta$-Jacobi ensembles, for an "extremal" class of $\beta$-Jacobi ensembles (when $n_1 =
o(\sqrt{n_2})$ and $n = O(\sqrt{n_2})$), as mentioned before, Jiang [23] has established a series of important results, among
which are the calculations of fluctuations, through approximation methods.

For all $\beta$-Jacobi ensembles of fixed parameters, Killip [28] proved that the fluctuations of macroscopic
statistics obey a CLT; this result is similar to the one we obtain, but in the case that $f = \mathbf{1}_I$ where $I$ is a
(fixed, independent of $n$) finite union of intervals in [0, 1] and under a different normalization. It is unclear
how Killip's result changes if the parameters of the ensemble scale with $n$, which is the regime studied
here. In addition, while our method does not allow us to obtain any results for discontinuous functions, it
seems that going in the opposite direction - using Killip's results to obtain fluctuation theorems for smooth
functions - would need microscopic statistics, i.e. where the lengths of the intervals shrink with $n$. As Killip
notes, the microscopic regime is much more difficult and is not covered in [25].

Last but by no means least, we would like to mention that the most comprehensive results for linear
functionals in the case of $\beta$-ensembles found in the literature have been obtained by Johansson [25]. The
fluctuations obtained in [25] are true for any $\beta > 0$, in the case of Hermitian matrices, for a large class
of (polynomial) potentials, and for a large class of functions $f$ (in its full generality, Johansson's work is
applicable to $H^{17/2}$ functions, where $H^a$ stands for the corresponding Sobolev space). The methods are
analytical and make heavy use of potential theory. In addition to the fluctuations, Johansson was also able
to obtain the deviation from the mean (second-order asymptotics), for the same class of functions.

Johansson's results subsume the work [17] in the case of $\beta$-Hermite matrices (general $\beta$, fixed potential
$V(x) = x^2$), and have served as a "moral" (albeit not technical) inspiration to us in our quest.

1.1. Our results. Our purpose in this paper is to calculate the global fluctuations for $\beta$-Jacobi ensembles,
for as large a class of functions $f$ as possible. By using concentration properties of the Jacobi ensemble and
making use of a theorem by Anderson and Zeitouni [2], cited below, we were able to obtain the fluctuations
for all $\beta$ in the case of $C^1$ test functions on [0, 1]. We only obtain the deviation from the mean for polynomial
test functions, and conjecture the deviation should extend to a larger class of functions.

Our asymptotic analysis will occur in the proportional scaling regime, and so we will make the following
assumptions on the growth of $n_1$ and $n_2$.

Assumption 1.2. Let $n_1 = pn$ and $n_2 = qn$ for some fixed $p, q \ge 1$ having $p + q > 2$.

Chebyshev polynomials are an essential ingredient to our proof, both for their analytic properties and 
their combinatorial ones. We define the shifted Chebyshev polynomials of the first kind, T, by 

r„(x)^2r»( 2 " A ' + A+ A ' A - 

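where $T_n$ are the standard Chebyshev polynomials of the first kind, satisfying $T_n(\cos\theta) = \cos n\theta$. By making
a change of variables, it immediately follows that $\{\Gamma_n(x)\}_{n=0}^\infty$ are a complete orthonormal system for
the weighted $L^2$ space on $[\lambda_-, \lambda_+]$ induced by the inner product

$\langle f, g \rangle = \frac{1}{\pi} \int_{\lambda_-}^{\lambda_+} f(x)\, g(x)\, \frac{1}{\sqrt{(\lambda_+ - x)(x - \lambda_-)}} \, dx.$

Using this inner product, we define the Chebyshev coefficients

(7) $\hat{f}(n) = \langle f, \Gamma_n \rangle = \frac{1}{\pi} \int_{\lambda_-}^{\lambda_+} f(x)\, \Gamma_n(x)\, \frac{1}{\sqrt{(\lambda_+ - x)(x - \lambda_-)}} \, dx.$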

Our main result is given below. 



Theorem 1.3. Let $A$ be an $n \times n$ $\beta$-Jacobi matrix, with $n_1, n_2$ satisfying Assumption 1.2. For any fixed
$k \in \mathbb{N}$, the $k$-tuple $(X_{\Gamma_1, A}, \ldots, X_{\Gamma_k, A})$ converges in distribution to the $k$-tuple of independent centered normal
variables (Yi, . . . , Yfe), where Yi has variance %i. Further, for any f continuously differentiable on [0, 1], the 
variable Xf t A converges in distribution to a centered normal variable Yf , with variance given by 



r 2 



o~l 



Z_^ 



n\f{n)\\ 



where $\hat{f}(n)$ is the $n$th Chebyshev coefficient, defined as in (7).

Remark 1.4. In analogy with Fourier series on the unit circle, it is alluring to consider the condition above
for $f$ as requiring one half a derivative, in the $L^2$ sense; we would expect

$\tau_f^2 = \sum_{n=1}^\infty n^2 |\hat{f}(n)|^2$

to behave like the square-$L^2$ norm of $f'$, and this can be easily established. Precisely,

rf = ^ f + \f'(x)\W(X + -x)(x-X-)dx, 

where the proof follows from the identity $T_n'(x) = n U_{n-1}(x)$, with $U$ the Chebyshev polynomial of the second
kind, and the orthonormality of $\{U_n\}$ with the weight $\sqrt{1 - x^2}$. Since $\tau_f^2 \ge \sigma_f^2$, given the $C^1$ condition for $f$
on $[\lambda_-, \lambda_+]$, the variance in Theorem 1.3 is finite.

Remark 1.5. Note that the case when $p = q = 1$ is not covered. This is the case when neither one of the
exponents of the ensemble grows to $\infty$; the method of proof collapses since one of the main ingredients, the
ability to get uniform tail bounds for entries of the matrix, is no longer available at the "bottom right" corner of
the matrix, and as such the errors can no longer be accurately estimated by the same means. We present
the results of some numerical simulations for this case in Section 6. We also note that the theorem is proven
by Johansson in the $\beta = 2$ case by methods of orthogonal polynomial theory [53].

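Our second result concerns the deviation from the mean, and is restricted to polynomial functions.
Theorem 1.6. For any polynomial $\phi$,

$\mathbb{E}\operatorname{tr}(\phi(A)) = n \int_{\lambda_-}^{\lambda_+} \phi(x)\, d\mu(x) + \left(\frac{2}{\beta} - 1\right) \int_{\lambda_-}^{\lambda_+} \phi(x)\, d\nu(x) + o(1),$

where $\mu$ is as defined in Theorem 1.1 and $\nu$ is the signed measure with density

$d\nu := \frac{1}{4}\delta_{\lambda_-} + \frac{1}{4}\delta_{\lambda_+} - \frac{1}{2\pi\sqrt{-(x - \lambda_+)(x - \lambda_-)}}\, \mathbf{1}_{(\lambda_-, \lambda_+)}\, dx.$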

The structure of the paper follows the method of proof, which takes the following steps:
Step 1. Prove a "central limit theorem" for polynomials;
Step 2. Find the class of polynomials which diagonalizes the covariance matrix for the resulting Gaussian
process;
Step 3. Use concentration techniques to show that $C^1[0, 1]$ linear statistics can be approximated by polynomial
test functions in such a way that the variance of the difference of the two is small for all $n$.
Step 4. Prove that the approximation works asymptotically.

The rest of the paper is structured as follows: after a reparameterization of the model (Section 1.2), Section
2 covers Step 1 in the above "recipe": show that the fluctuations are Gaussian when the test functions are the
monomials. The proof extends the mechanism that was employed in [17] for the $\beta$-Hermite and $\beta$-Laguerre
ensembles. In Section 3 we show that the limiting covariance is diagonalized in the shifted Chebyshev basis; the
method employed is original and has to do with the generating function of the covariance matrix. Section 4
contains the proof that the matrix model satisfies the necessary conditions to apply the Anderson-Zeitouni
theorem. Section 5 contains the proof of Theorem 1.6 (calculating the deviation from the mean for analytic
functions). Section 6 contains experimental results for the case that $p = q = 1$. Finally, we have included
three Appendices. Appendix A, which is the longest of the three, contains the symmetric function theory
results necessary for the calculation of the deviation (Section 3); more explicitly, it contains the proof that
the series expansion of the functional $\mathcal{F}(A)$ for monomial $f$ has a "palindromic" quality (the mechanism here
is similar to the one employed in [17]). Appendix B shows the existence of a Poincaré inequality for Beta
variables that is stronger than what can be proven using general log-concave theory. Finally Appendix C
shows a theorem of independent interest, which we proved in the course of an unsuccessful attempt to obtain
our main result by a different approximation method: that "square root of beta" variables can be coupled
to Gaussian variables in such a way as to have small variance.

1.2. Reparameterization. While the parameters given naturally arise in the full matrix model (which
exists only for $\beta = 1, 2, 4$), e.g., as the size ratios of the two Wishart matrices involved, we choose to work
with a slightly different set of parameters for the purposes of this problem. Define parameters $a$ and $b$ by

$a := \frac{1}{p + q}$ and $b := \frac{p}{p + q}.$

As we shall see, $a$ and $b$ allow us to express the results in a "cleaner", perhaps more natural form. They
expose symmetries of the asymptotics, which are invariant under the involution $a \mapsto 1 - b$, $b \mapsto 1 - a$.

For the regime of consideration of Theorem 1.3 the parameters $a$ and $b$ take on values in the triangle
$0 < a < \frac{1}{2}$, and $a \le b \le 1 - a$. The limiting spectral distribution will have support given by

$\lambda_\pm = \left(\sqrt{b(1 - a)} \pm \sqrt{a(1 - b)}\right)^2.$
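The reparameterization can be checked numerically. The following is a minimal sketch, assuming the definitions $a = 1/(p+q)$, $b = p/(p+q)$ and reading the support edges as $(\sqrt{b(1-a)} \pm \sqrt{a(1-b)})^2$; the function names are illustrative and not part of the paper. The comparison formula in terms of $p$ and $q$ is included only as an algebraic consistency check under the same assumptions.

```python
from math import sqrt, isclose


def reparameterize(p, q):
    """Map the size ratios (p, q) to the parameters (a, b) (assumed form)."""
    return 1.0 / (p + q), p / (p + q)


def support_edges(a, b):
    """Edges of the limiting support, read as (sqrt(b(1-a)) -/+ sqrt(a(1-b)))^2."""
    r1, r2 = sqrt(b * (1 - a)), sqrt(a * (1 - b))
    return (r1 - r2) ** 2, (r1 + r2) ** 2


def support_edges_pq(p, q):
    """The same edges written directly in p and q (algebraically equivalent)."""
    r1, r2 = sqrt(p * (p + q - 1)), sqrt(q)
    return ((r1 - r2) / (p + q)) ** 2, ((r1 + r2) / (p + q)) ** 2


if __name__ == "__main__":
    a, b = reparameterize(2.0, 3.0)
    print("a, b =", a, b)
    print("edges from (a, b):", support_edges(a, b))
    # The involution a -> 1 - b, b -> 1 - a should leave the edges unchanged.
    print("edges after involution:", support_edges(1 - b, 1 - a))
```

The involution swaps the two square roots' arguments only in their order of multiplication, which is why the edges are unchanged.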



The reciprocal expression $\frac{2}{\beta}$ appears frequently, with some terms having polynomial dependence upon it.
Thus in the proofs we have used $\alpha$ in place of $\frac{2}{\beta}$. The Jacobi ensemble density, with these parameters, is
expressed as



(8) 



dfj,j(Xi,...,X n ) 



"[6_i] + i_i nTj-6 11 

Hx? [a J a (i-\i) a \. a \ a n 



A, — A, 



l<3 



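The tridiagonal matrix model with these parameters is given by $A = B_\alpha B_\alpha^T$ where

(9) $B_\alpha = \begin{pmatrix} c_n s'_{n-1} & & & & \\ -s_{n-1} c'_{n-1} & c_{n-1} s'_{n-2} & & & \\ & -s_{n-2} c'_{n-2} & c_{n-2} s'_{n-3} & & \\ & & \ddots & \ddots & \\ & & & -s_1 c'_1 & c_1 \end{pmatrix},$

$\{c_1, c_2, \ldots, c_n, c'_1, c'_2, \ldots, c'_{n-1}\}$ mutually independent,

$c_i \sim \sqrt{\mathrm{Beta}\left(\frac{bn}{a\alpha} + \alpha^{-1}(i - n),\ \frac{(1-b)n}{a\alpha} + \alpha^{-1}(i - n)\right)}$ and $c'_i \sim \sqrt{\mathrm{Beta}\left(\alpha^{-1} i,\ \frac{n}{a\alpha} + \alpha^{-1}(i - 2n + 1)\right)}$,

$s_i = \sqrt{1 - c_i^2}$ and $s'_i = \sqrt{1 - (c'_i)^2}$.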



2. Polynomial Fluctuations 

2.1. Traces of Powers and Path Counting. When the linear statistic $f$ is a polynomial, it can be
computed explicitly using powers of the matrix model. By linearity, this reduces to the study of monomials
$\operatorname{tr}(A^k)$, and by the tridiagonality of $A$, there is a simple combinatorial expansion for this trace. In particular,
these traces can be expressed in terms of certain lattice paths. In this section we will study these lattice
paths and develop their combinatorial properties. We will use these combinatorial properties to compute the
covariance of the limiting Gaussian process for polynomial test functions. Their properties are not needed
for the proof that the limiting fluctuations are Gaussian.

Definition 2.1. An alternating bridge is a lattice path from (0, 0) to (2k, 0) using only the steps (1, 1),
(1, 0), and (1, -1), none of whose odd steps travel up and none of whose even steps travel down. Let $\mathcal{A}_{2k}$
denote the collection of all such lattice paths. Likewise, let $\mathcal{L}_k$ denote the collection of all lattice paths of
length $k$ without the alternating property (that is, paths from (0, 0) to (k, 0) using the same three steps).

Remark 2.2. These paths bear some similarity to the alternating Motzkin paths which have been used to
study the Laguerre ensemble [16]. These paths differ in that Motzkin paths are restricted to stay above the
$x$-axis, while these are allowed to go above and below the axis.

For a lattice path $w$ starting at $(0, k)$ with sequence of vertical coordinates $\{w_0 = k, w_1, w_2, \ldots\}$ and an
$n \times n$ matrix $M$, define $M_w$ to be the product

$M_w = \prod_{i=0}^{\ell - 1} M_{w_{2i}, w_{2i+1}} M^T_{w_{2i+1}, w_{2i+2}} = \prod_{i=0}^{\ell - 1} M_{w_{2i}, w_{2i+1}} M_{w_{2i+2}, w_{2i+1}},$

where $2\ell$ is the number of steps of $w$, provided that all $1 \le w_i \le n$. If the lattice path $w$ walks off the edge
of the matrix, in the sense that either some $w_i > n$ or $w_i < 1$, then define $M_w = 0$.

Example 2.3. A lattice path $w$ and its associated product $M_w$.

Provided the matrix $M$ is at least $6 \times 6$, this lattice path $w$ would produce the product
$M_w = M_{6,5} M^T_{5,5} M_{5,4} M^T_{4,5} M_{5,5} M^T_{5,6}$.




Expanding the trace,

$\operatorname{tr} A^k = \sum_{i=1}^n \left[(B_\beta B_\beta^T)^k\right]_{i,i}.$

The diagonal entries $[(B_\beta B_\beta^T)^k]_{i,i}$ can be written in terms of alternating bridges, since for all $1 \le i \le n$,

$\left[(B_\beta B_\beta^T)^k\right]_{i,i} = \sum_{w \in \mathcal{A}_{2k}} (B_\beta)_{w + i},$

where $w + i$ is the lattice path $w$ shifted up by $i$. For convenience, define $\mathcal{A}_{2k,n}$ to be all alternating bridges
that are shifted up to start at coordinates between 1 and $n$; we will refer to these lattice paths as tridiagonal
trace paths. In terms of these paths, we can write the trace of a power of a matrix as

$\operatorname{tr} A^k = \sum_{w \in \mathcal{A}_{2k,n}} (B_\beta)_w.$
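The path expansion of the trace can be checked by brute force on a small example. The sketch below assumes a generic lower bidiagonal matrix with arbitrary random entries (not the Beta-distributed entries of the model), and compares the sum over shifted alternating bridges with the trace computed from matrix powers. All function names are illustrative.

```python
import itertools
import random


def alternating_bridges(k):
    """Yield step sequences of length 2k summing to 0: odd-numbered
    steps are 0 or -1, even-numbered steps are 0 or +1."""
    allowed = [(0, -1) if j % 2 == 0 else (0, 1) for j in range(2 * k)]
    for steps in itertools.product(*allowed):
        if sum(steps) == 0:
            yield steps


def path_product(B, start, steps):
    """Product of entries of B and B^T along the shifted path
    (heights are 1-indexed rows/columns); 0 if the path leaves 1..n."""
    n = len(B)
    height, prod = start, 1.0
    for j, step in enumerate(steps):
        nxt = height + step
        if not 1 <= nxt <= n:
            return 0.0
        if j % 2 == 0:  # odd-numbered step: entry B[height, nxt]
            prod *= B[height - 1][nxt - 1]
        else:           # even-numbered step: entry B^T[height, nxt]
            prod *= B[nxt - 1][height - 1]
        height = nxt
    return prod


def trace_by_paths(B, k):
    """Sum of path products over all tridiagonal trace paths."""
    n = len(B)
    return sum(path_product(B, i, w)
               for i in range(1, n + 1) for w in alternating_bridges(k))


def trace_by_powers(B, k):
    """tr((B B^T)^k) by explicit matrix multiplication."""
    n = len(B)
    A = [[sum(B[i][m] * B[j][m] for m in range(n)) for j in range(n)]
         for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = [[sum(P[i][m] * A[m][j] for m in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(P[i][i] for i in range(n))


def random_lower_bidiagonal(n, rng):
    """Lower bidiagonal test matrix with arbitrary (non-model) entries."""
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        B[i][i] = rng.random()
        if i > 0:
            B[i][i - 1] = -rng.random()
    return B


if __name__ == "__main__":
    B = random_lower_bidiagonal(5, random.Random(0))
    for k in (1, 2, 3):
        print(k, trace_by_paths(B, k), trace_by_powers(B, k))
```

The enumeration is exponential in $k$, so this is only meant as a sanity check for small $k$ and $n$.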

When $n$ is large and $k$ is fixed, each $(B_\beta)_w$ is approximated by a substantially simpler quantity: every entry
in a $2k \times 2k$ principal submatrix on the diagonal of $A$ is strongly approximated by a deterministic tridiagonal
band matrix (c.f. Lemmas 12.191 and I2.19[). Thus, endow an alternating bridge with a weight by giving each
horizontal edge weight $x$ and each inclined edge weight $y$. Define the weight of the bridge to be the product
of the weights of its edges, and define $p_k(x, y)$ to be the sum of all the weights over all the paths in $\mathcal{A}_{2k}$. If
we let $h(w)$ denote the number of horizontal steps taken by path $w$, then

$p_k(x, y) = \sum_{w \in \mathcal{A}_{2k}} x^{h(w)} y^{2k - h(w)}.$

We are interested in finding the exponential generating function for these $p_k$, i.e. we will compute

$\mathcal{P}(t) = \sum_{k=0}^\infty \frac{t^k}{k!}\, p_k(x, y)$

and show that

(10) $\mathcal{P}(t) = e^{t(x^2 + y^2)} I_0(2xyt),$

where $I_0$ is the modified Bessel function of the first kind.

These polynomials exhibit some nice combinatorial properties. Suppose that a path $w \in \mathcal{A}_{2k}$ has $i$
up-steps. Because the path returns to 0, it must also have $i$ down-steps. Down-steps must be placed in
odd positions, and up-steps must be placed in even positions; as a result, the placement of the up-steps is
independent from the placement of the down-steps. Thus, there are exactly $\binom{k}{i}\binom{k}{i}$ paths in $\mathcal{A}_{2k}$ having $2i$
inclined steps. Note, this argument also shows that the number of inclined steps must be even. Consequently,
the number of horizontal steps is even as well, and we have shown

(11) $p_k(x, y) = \sum_{i=0}^{k} \binom{k}{i}^2 x^{2i} y^{2(k - i)} = y^{2k}\, {}_2F_1\left(-k, -k; 1; (x/y)^2\right).$

For definitions and properties of the hypergeometric function ${}_2F_1$, see [1, page 556]. As a consequence, we
are able to compute the size of $\mathcal{A}_{2k}$ by simply evaluating this polynomial at $x = y = 1$,

$|\mathcal{A}_{2k}| = \sum_{i=0}^{k} \binom{k}{i}^2 = \binom{2k}{k}.$
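The counting argument above can be confirmed by direct enumeration. This is a minimal sketch with illustrative function names: it lists all alternating bridges for small $k$, sums their weights, and compares with the binomial formula and with the count $\binom{2k}{k}$.

```python
import itertools
from math import comb


def alternating_bridges(k):
    """Yield step sequences of length 2k summing to 0: odd-numbered
    steps are 0 or -1, even-numbered steps are 0 or +1."""
    allowed = [(0, -1) if j % 2 == 0 else (0, 1) for j in range(2 * k)]
    for steps in itertools.product(*allowed):
        if sum(steps) == 0:
            yield steps


def p_bruteforce(k, x, y):
    """Sum of x^(horizontal steps) * y^(inclined steps) over all bridges."""
    total = 0
    for steps in alternating_bridges(k):
        h = steps.count(0)
        total += x ** h * y ** (2 * k - h)
    return total


def p_formula(k, x, y):
    """Closed form: sum_i C(k, i)^2 x^(2i) y^(2(k - i))."""
    return sum(comb(k, i) ** 2 * x ** (2 * i) * y ** (2 * (k - i))
               for i in range(k + 1))


if __name__ == "__main__":
    for k in range(6):
        count = sum(1 for _ in alternating_bridges(k))
        print(k, count, comb(2 * k, k), p_bruteforce(k, 2, 3), p_formula(k, 2, 3))
```

Integer arguments keep the comparison exact; the two forms of the weight agree because $\binom{k}{i} = \binom{k}{k-i}$.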



While the alternating structure naturally lends itself to describing traces of $A$, there is another way to
view $\mathcal{A}_{2k}$ which lends itself better to computing $\mathcal{P}(t)$. If $w = w_1 w_2 \cdots w_{2k-1} w_{2k}$, for steps $w_i$, then the
concatenation of the steps $w_{2i-1} w_{2i}$ is one of (2, 1), (2, -1) or (2, 0). Moreover, if it is either of the first two,
then by the alternating structure, $w_{2i-1} w_{2i}$ must have been (1, 0)(1, 1) or (1, -1)(1, 0) respectively. If it was
a horizontal step, then there are two possibilities, either (1, 0)(1, 0) or (1, -1)(1, 1).

Definition 2.4. By concatenating pairs of steps, alternating bridges $w$ are in bijective correspondence
with lattice paths in $\mathcal{L}_k$ whose horizontal steps are 2-colored. Let those horizontal steps corresponding to
(1, 0)(1, 0) be colored red, and let those horizontal steps corresponding to (1, -1)(1, 1) be colored blue.

Example 2.5. Two alternating bridges with the overlaid $\mathcal{L}_k$ path.
[Figure legend: inclined step; red step, (1, 0)(1, 0); blue step, (1, -1)(1, 1).]



Lemma 2.6. Let w £ A<zk,n be given, and define S™ (m) to be the number of times w walks from height m 
to height m + 1 or back, and let 5™ (m) be the number of times that w walks horizontally at height m. Both 
S 1 ™ (m) and S 1 ™ (m) are even. 

Proof. Let $u$ be the colored lattice path from $\mathcal{L}_k$ that corresponds to $w$. Let $v$ be the number of steps that
u makes between height m and height m + 1 and back. Because u returns to its starting height, v is even. 
Let R be the number of red horizontal steps (i.e. those resulting from a (1,0)(1,0) pattern) that u makes 
at height m, and let B be the number of blue horizontal steps (those resulting from a (1, — 1)(1, 1) pattern) 
that u makes at height m+ 1. Because S™(m) = v + 2R and S™ (m) — v + 2B, both are always even. □ 
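The parity statement of Lemma 2.6 can be checked exhaustively for small $k$. The sketch below, with illustrative names, counts for every alternating bridge the horizontal steps taken at each height and the steps taken between each pair of adjacent heights, and verifies that all counts are even. Shifting a bridge vertically only relabels heights, so unshifted bridges suffice.

```python
import itertools
from collections import Counter


def alternating_bridges(k):
    """Yield step sequences of length 2k summing to 0: odd-numbered
    steps are 0 or -1, even-numbered steps are 0 or +1."""
    allowed = [(0, -1) if j % 2 == 0 else (0, 1) for j in range(2 * k)]
    for steps in itertools.product(*allowed):
        if sum(steps) == 0:
            yield steps


def step_counts(steps, start=0):
    """Return (horizontal, crossings): horizontal[m] counts flat steps at
    height m, crossings[m] counts steps between heights m and m + 1."""
    horizontal, crossings = Counter(), Counter()
    height = start
    for step in steps:
        if step == 0:
            horizontal[height] += 1
        else:
            crossings[min(height, height + step)] += 1
        height += step
    return horizontal, crossings


def all_counts_even(k):
    """True if both kinds of counts are even for every bridge in A_{2k}."""
    for steps in alternating_bridges(k):
        horizontal, crossings = step_counts(steps)
        if any(c % 2 for c in horizontal.values()):
            return False
        if any(c % 2 for c in crossings.values()):
            return False
    return True


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, all_counts_even(k))
```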

The correspondence between colored $\mathcal{L}_k$ and $\mathcal{A}_{2k}$ allows the polynomials $p_k(x, y)$ to be represented in a
third way. We will define the weight of an uncolored path $p \in \mathcal{L}_k$ to equal the sum of the weights over
all alternating bridges $w$ to which its colorings correspond. Suppose that an alternating bridge $w$ is in
correspondence with a colored path $p$, one with $r$ red edges and $b$ blue edges. Recall that $h(p)$ is the number
of horizontal steps the path takes, and therefore the weight of $w$ is $(xy)^{k - h(p)} x^{2r} y^{2b}$. There are $\binom{h(p)}{r}$ ways
of placing the $r$ red edges on the path (after which the placement of the $b$ blue edges is determined). As the
possible colorings of a fixed path $p$ are in bijective correspondence with $\{1, 0\}^{h(p)}$, it follows that the sum of
the weights corresponding to all different colorings of a given path $p$ is $(xy)^{k - h(p)} (x^2 + y^2)^{h(p)}$. In conclusion,
$p_k(x, y)$ can be written as

$p_k(x, y) = \sum_{p \in \mathcal{L}_k} (xy)^{k - h(p)} (x^2 + y^2)^{h(p)}.$
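This third representation is also easy to test by enumeration. The sketch below, with illustrative names, sums $(xy)^{k-h(p)}(x^2+y^2)^{h(p)}$ over all bridges of length $k$ with steps $+1$, $0$, $-1$ and compares the result with the binomial formula for $p_k$.

```python
import itertools
from math import comb


def bridges(k):
    """Yield paths from (0, 0) to (k, 0) with steps +1, 0, -1."""
    for steps in itertools.product((-1, 0, 1), repeat=k):
        if sum(steps) == 0:
            yield steps


def p_via_colorings(k, x, y):
    """Sum over uncolored paths of (xy)^(k - h) * (x^2 + y^2)^h."""
    total = 0
    for steps in bridges(k):
        h = steps.count(0)
        total += (x * y) ** (k - h) * (x * x + y * y) ** h
    return total


def p_formula(k, x, y):
    """Closed form: sum_i C(k, i)^2 x^(2i) y^(2(k - i))."""
    return sum(comb(k, i) ** 2 * x ** (2 * i) * y ** (2 * (k - i))
               for i in range(k + 1))


if __name__ == "__main__":
    for k in range(7):
        print(k, p_via_colorings(k, 2, 3), p_formula(k, 2, 3))
```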
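The subset of the lattice paths $\mathcal{L}_k$ that fixes a given horizontal edge is in bijective correspondence with
$\mathcal{L}_{k-1}$, simply by removing the given edge. By inclusion-exclusion, it follows immediately that the lattice
paths in $\mathcal{L}_k$ that have no horizontal steps are counted by

$\binom{k}{0}|\mathcal{L}_k| - \binom{k}{1}|\mathcal{L}_{k-1}| + \binom{k}{2}|\mathcal{L}_{k-2}| - \cdots = \begin{cases} \binom{k}{k/2} & k \text{ even}, \\ 0 & k \text{ odd}. \end{cases}$

The correspondence between $\mathcal{L}_k$ with a fixed horizontal edge and $\mathcal{L}_{k-1}$ decreases the statistic $h(p)$ by exactly
1, and so this inclusion-exclusion formula carries over to $p_k$ as

$p_k(x, y) - \binom{k}{1}(x^2 + y^2)\, p_{k-1}(x, y) + \binom{k}{2}(x^2 + y^2)^2\, p_{k-2}(x, y) - \cdots = \begin{cases} \binom{k}{k/2}(xy)^k & k \text{ even}, \\ 0 & k \text{ odd}. \end{cases}$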
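The inclusion-exclusion identity for $p_k$ can be verified with exact integer arithmetic. The sketch below assumes the binomial formula for $p_k$ and reads the alternating sum with coefficients $\binom{k}{j}(x^2+y^2)^j$; function names are illustrative.

```python
from math import comb


def p_formula(k, x, y):
    """Closed form: sum_i C(k, i)^2 x^(2i) y^(2(k - i))."""
    return sum(comb(k, i) ** 2 * x ** (2 * i) * y ** (2 * (k - i))
               for i in range(k + 1))


def alternating_sum(k, x, y):
    """sum_j (-1)^j C(k, j) (x^2 + y^2)^j p_{k-j}(x, y)."""
    z = x * x + y * y
    return sum((-1) ** j * comb(k, j) * z ** j * p_formula(k - j, x, y)
               for j in range(k + 1))


def no_horizontal_weight(k, x, y):
    """Weight of paths with no horizontal step: C(k, k/2) (xy)^k or 0."""
    return comb(k, k // 2) * (x * y) ** k if k % 2 == 0 else 0


if __name__ == "__main__":
    for k in range(8):
        print(k, alternating_sum(k, 2, 3), no_horizontal_weight(k, 2, 3))
```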



This recurrence can be recast in terms of the exponential generating function $\mathcal{P}(t)$ to read

$e^{-t(x^2 + y^2)}\, \mathcal{P}(t) = 1 + \frac{x^2 y^2 t^2}{2!}\binom{2}{1} + \frac{x^4 y^4 t^4}{4!}\binom{4}{2} + \frac{x^6 y^6 t^6}{6!}\binom{6}{3} + \cdots = I_0(2xyt).$

Thus, we have shown (10),

$\mathcal{P}(t) = e^{t(x^2 + y^2)} I_0(2xyt).$
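A quick numerical comparison supports the closed form of the exponential generating function. The sketch below is an illustration only: it truncates the series $\sum_k t^k p_k(x,y)/k!$ and evaluates $I_0$ through its standard power series $\sum_m (z/2)^{2m}/(m!)^2$, with truncation lengths chosen by hand for moderate arguments.

```python
from math import comb, exp, factorial, isclose


def p_formula(k, x, y):
    """Closed form: sum_i C(k, i)^2 x^(2i) y^(2(k - i))."""
    return sum(comb(k, i) ** 2 * x ** (2 * i) * y ** (2 * (k - i))
               for i in range(k + 1))


def bessel_i0(z, terms=60):
    """Modified Bessel function I_0 via its power series (truncated)."""
    return sum((z / 2) ** (2 * m) / factorial(m) ** 2 for m in range(terms))


def egf_series(t, x, y, terms=80):
    """Truncated exponential generating function sum_k t^k p_k / k!."""
    return sum(t ** k / factorial(k) * p_formula(k, x, y) for k in range(terms))


def egf_closed(t, x, y):
    """Closed form exp(t (x^2 + y^2)) * I_0(2 x y t)."""
    return exp(t * (x * x + y * y)) * bessel_i0(2 * x * y * t)


if __name__ == "__main__":
    t, x, y = 0.7, 0.5, 1.2
    print(egf_series(t, x, y), egf_closed(t, x, y))
```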

Working with this function proves to be somewhat complicated, and it will be convenient to instead use the
Laplace transform of $\mathcal{P}(t)$. Let $L_t[f(t)](\omega)$ denote the Laplace transform in the variable $t$,

$L_t[f(t)](\omega) = \int_0^\infty e^{-\omega t} f(t)\, dt.$

When applicable, $L_{st}$ will denote the Laplace transform in both variables. The calculation of the Laplace
transform of $\mathcal{P}(t)$ is simplified greatly by some elementary properties of the Laplace transform and the
known Laplace transforms of modified Bessel functions. All of these properties are available for reference
in [1, Chapter 29]; properties of the modified Bessel functions are available in [1, Chapter 9]. The Laplace
transform of the modified Bessel functions $I_n$ is given by

(12) $L_t[I_n(ct)](\omega) = \frac{c^n}{\left(\omega + \sqrt{\omega^2 - c^2}\right)^n \sqrt{\omega^2 - c^2}}, \quad \omega > c.$

If for some real value of $\omega_0$, the Laplace transform is finite, then for any $\omega$ in the half plane $\Re\,\omega > \omega_0$, the
Laplace transform is finite. Further, the transform satisfies the following identities

(13) $L_t[e^{kt} f(t)](\omega) = L_t[f(t)](\omega - k),$

(14) $L_t[t f(t)](\omega) = -\frac{d}{d\omega} L_t[f(t)](\omega).$

We will show that a priori, the Laplace transform of $\mathcal{P}(t)$ is finite in the half plane $\Re\,\omega > (x + y)^2$. This
follows as $I_n(2xyt)$ satisfies the simple estimate

$0 \le I_n(2xyt) \le e^{2xyt},$

for $t > 0$, $2xy > 0$, and thus

$0 \le \mathcal{P}(t) \le e^{t(x + y)^2}.$

Identity (13) makes computing the Laplace transform of $\mathcal{P}(t)$ a simple substitution into (12), as

$L_t[\mathcal{P}(t)](\omega) = L_t[e^{t(x^2 + y^2)} I_0(2xyt)](\omega) = L_t[I_0(2xyt)](\omega - x^2 - y^2) = \frac{1}{\sqrt{(\omega - x^2 - y^2)^2 - 4x^2 y^2}}.$

Using (14), it is possible to compute the Laplace transform of $\partial_x \mathcal{P}(t)$, which arises later.



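Lemma 2.7.

$L_t[\partial_x \mathcal{P}(t)](\omega) = \frac{2x(\omega + y^2 - x^2)}{\left((\omega - x^2 - y^2)^2 - 4x^2 y^2\right)^{3/2}}, \quad \omega > (x + y)^2.$

Proof. This is a straightforward application of (13), (14) and the identity $I_0'(x) = I_1(x)$.

$L_t[\partial_x \mathcal{P}(t)](\omega) = L_t[2xt\, e^{t(x^2 + y^2)} I_0(2xyt) + 2yt\, e^{t(x^2 + y^2)} I_1(2xyt)](\omega)$

$= -\partial_\omega L_t[2x\, e^{t(x^2 + y^2)} I_0(2xyt) + 2y\, e^{t(x^2 + y^2)} I_1(2xyt)](\omega)$

$= -\partial_{\tilde\omega} \left[ \frac{2x}{\sqrt{\tilde\omega^2 - 4x^2 y^2}} + \frac{2y}{\sqrt{\tilde\omega^2 - 4x^2 y^2}} \cdot \frac{2xy}{\tilde\omega + \sqrt{\tilde\omega^2 - 4x^2 y^2}} \right],$

where $\tilde\omega$ is $\omega - x^2 - y^2$. Thus

$L_t[\partial_x \mathcal{P}(t)](\omega) = \frac{2x(\tilde\omega + 2y^2)}{(\tilde\omega^2 - 4x^2 y^2)^{3/2}} = \frac{2x(\omega + y^2 - x^2)}{\left((\omega - x^2 - y^2)^2 - 4x^2 y^2\right)^{3/2}}, \quad \omega > (x + y)^2.$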



□ 
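Both Laplace transform formulas can be checked numerically without quadrature. The sketch below relies on the elementary integral $\int_0^\infty e^{-\omega t} t^k / k!\, dt = \omega^{-k-1}$, so that integrating the exponential generating function term by term gives $\sum_k p_k(x, y)\, \omega^{-k-1}$, and likewise with $\partial_x p_k$ in place of $p_k$. The function names and the truncation length are illustrative choices, valid for $\omega$ comfortably larger than $(x + y)^2$.

```python
from math import comb, isclose, sqrt


def p_formula(k, x, y):
    """Closed form: sum_i C(k, i)^2 x^(2i) y^(2(k - i))."""
    return sum(comb(k, i) ** 2 * x ** (2 * i) * y ** (2 * (k - i))
               for i in range(k + 1))


def dp_dx(k, x, y):
    """Partial derivative of p_k with respect to x, term by term."""
    return sum(comb(k, i) ** 2 * 2 * i * x ** (2 * i - 1) * y ** (2 * (k - i))
               for i in range(1, k + 1))


def laplace_series(coeff, w, x, y, terms=120):
    """Termwise Laplace transform: sum_k coeff_k(x, y) * w^(-k-1)."""
    return sum(coeff(k, x, y) * w ** (-k - 1) for k in range(terms))


def laplace_P(w, x, y):
    """Closed form for the transform of the generating function."""
    return 1.0 / sqrt((w - x * x - y * y) ** 2 - 4 * x * x * y * y)


def laplace_dP(w, x, y):
    """Closed form from Lemma 2.7 for the transform of its x-derivative."""
    disc = (w - x * x - y * y) ** 2 - 4 * x * x * y * y
    return 2 * x * (w + y * y - x * x) / disc ** 1.5


if __name__ == "__main__":
    w, x, y = 4.0, 0.5, 0.8  # requires w > (x + y)^2 = 1.69
    print(laplace_series(p_formula, w, x, y), laplace_P(w, x, y))
    print(laplace_series(dp_dx, w, x, y), laplace_dP(w, x, y))
```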



Remark 2.8. In a manner of speaking, we have circuitously arrived at the regular generating function for 
Pk(%-,y), since it is possible to deduce the generating function from the exponential generating function by 
way of the Laplace transform, as follows. Let & R (t) denote the generating function, 

oo 

£»*(<) = JVpfcfot/). 

fc=0 

The effect of taking the Laplace transform on an exponential generating function can be understood using 
the Gamma function. 

/>oo 

Lt[3»(t)](w) = / e- ut 0>(t)dt 



/ e~^Y,-nPk(x,y)dt. 



The order of summation and integration can be interchanged because vrPk(x, y) is always positive for t > 0, 
x, y € R, 

L t [^(*)](«) = 53/ e-^- Pk (x,y)dt. 
k=o Jo 

Make the change of variables s = wt, so that 

OO n ryQ If 

OO 

= ^2uj~ k ~ 1 p k (x,y)ds 

k=0 

= @> r {oj- 1 )w- x . 
Thus, putting everything together, 

&> R {lo) 



y(l — ui(x 2 + y 2 )) — 4u> 2 x 2 y 2 
2.2. Asymptotic Normality of Fluctuations. We show in this subsection that polynomial test functions asymptotically have jointly normal fluctuations. This is the first component of Theorem 1.3, and we summarize the precise claim in the following proposition.

Proposition 2.9. Let A be an n x n (3-Jacobi matrix, with parameters as described in Section U.ffl For 
any fixed $k \in \mathbb{N}$, the $k$-tuple $(X_{x^1,A}, X_{x^2,A}, \ldots, X_{x^k,A})$ converges in distribution to a centered multivariate normal random variable.

The method of proof will be the computation of the moments. Recall that a multivariate normal variable 
has mixed moments characterized by the Wick formula, which we will state precisely. 

Proposition 2.10. A centered random vector $(Z_1, Z_2, \ldots, Z_k)$ is a multivariate normal if and only if for each word $m \in [k]^l$, the mixed moments satisfy
\[ \mathbb{E} \prod_{u \in m} Z_u = \begin{cases} 0 & \text{if } l \text{ is odd}, \\ \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} Z_{m_a} Z_{m_b} & \text{if } l \text{ is even}, \end{cases} \]
where the sum is over all graphs $G$ that are perfect matchings on the vertices $[l]$, and where $\mathcal{E}(G)$ is the edge set of this graph.
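The Wick formula can be made concrete by enumerating perfect matchings directly. The sketch below computes the mixed moment of a centered Gaussian vector from a given covariance matrix by summing over all pairings of the positions of the word; the function names and the 0-based indexing of letters are illustrative conventions, not notation from the text.

```python
def perfect_matchings(vertices):
    """Yield every perfect matching of the list `vertices` as a list of pairs."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching

def wick_moment(word, cov):
    """E[prod_{u in word} Z_u] for a centered Gaussian vector with covariance `cov`.

    `word` is a list of 0-based coordinate indices; the sum runs over perfect
    matchings of the positions 0..len(word)-1, as in the Wick formula.
    """
    l = len(word)
    if l % 2 == 1:
        return 0.0
    total = 0.0
    for matching in perfect_matchings(list(range(l))):
        prod = 1.0
        for a, b in matching:
            prod *= cov[word[a]][word[b]]
        total += prod
    return total

if __name__ == "__main__":
    cov = [[1.0, 0.5], [0.5, 2.0]]
    print(wick_moment([0, 0, 1, 1], cov))     # E[Z_0^2 Z_1^2]
```

For a single variable with variance $v$ this reproduces the familiar fourth moment $3v^2$, since $[4]$ has three perfect matchings.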



To prove Proposition 2.9, it suffices to show that all the mixed moments asymptotically obey the Wick formula. Thus, our first goal is to show that the moments have the correct form.

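Proposition 2.11. For a fixed word $m \in [k]^l$,
\[ \mathbb{E} \prod_{u \in m} X_{x^u,A} = \begin{cases} O(n^{-1/2}) & \text{if } l \text{ is odd}, \\ \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} X_{x^{m_a},A} X_{x^{m_b},A} + O(n^{-1/2}) & \text{if } l \text{ is even}, \end{cases} \]
where the sum is over all graphs $G$ that are perfect matchings on the vertices $[l]$, and where $\mathcal{E}(G)$ is the edge set of this graph.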

This nearly proves Proposition 2.9, but it remains to show that the covariances have a limit. We will delay this proof as we will identify the limiting covariance explicitly, and we begin in the direction of proving Proposition 2.11. In the sequel, fix some word $m \in [k]^l$. We will write the mixed moment indicated by $m$ in a way that exposes its asymptotically relevant terms. The first step is to write the mixed moment in terms of tridiagonal trace paths.
\[ \mathbb{E} \prod_{u \in m} X_{x^u,A} = \sum_{w_1, \ldots, w_l} \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right], \tag{15} \]
where the sum is over all tridiagonal trace paths $(w_1, \ldots, w_l) \in \mathcal{A}_{2m_1,n} \times \mathcal{A}_{2m_2,n} \times \cdots \times \mathcal{A}_{2m_l,n}$.

Each nonzero random variable $A_{w_i}$ is a product of matrix entries. More specifically, by an earlier lemma, trace paths visit each matrix entry an even number of times, and so $A_{w_i}$ is a polynomial in the random variables $\{c_j^2\}$ and $\{(c_j')^2\}$. Thus for each tridiagonal trace path $w_i$ for which $A_{w_i} \ne 0$, it is possible to define random variables $q_j^{w_i}$ with $1 \le j \le 2n-1$ so that

(1) $q_j^{w_i}$ is a polynomial in $c_j^2$ for $1 \le j \le n$;

(2) $q_{j+n}^{w_i}$ is a polynomial in $(c_j')^2$ for $1 \le j \le n-1$;

(3) $A_{w_i} = \prod_{j=1}^{2n-1} q_j^{w_i}$;

(4) The smallest nonzero coefficient of each $q_j^{w_i}$ is 1.

We will write $q_j^{w_i}(x)$ for the corresponding polynomial in $x$, while when no argument is provided, we mean the random variable defined above. This decomposition breaks a random variable $A_{w_i}$ into a product of independent random variables. Further, each polynomial has the form $q_j^{w_i}(x) = x^{a_{i,j}} (1-x)^{b_{i,j}}$ for some non-negative integer powers. Note, however, that most of these polynomials are identically 1.



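We will use these polynomials to give an alternative expression for the difference $A_{w_i} - \mathbb{E} A_{w_i}$. Specifically, we telescope in the following way.
\begin{align*}
A_{w_i} - \mathbb{E} A_{w_i} &= \prod_{j=1}^{2n-1} \left[ (q_j^{w_i} - \mathbb{E} q_j^{w_i}) + \mathbb{E} q_j^{w_i} \right] - \prod_{j=1}^{2n-1} \mathbb{E} q_j^{w_i} \\
&= \sum_{\substack{S \subseteq [2n-1] \\ S \ne \emptyset}} \prod_{j \in S} (q_j^{w_i} - \mathbb{E} q_j^{w_i}) \prod_{j \notin S} \mathbb{E} q_j^{w_i}. \tag{16}
\end{align*}
In this last step we omit the empty set precisely because it is the term canceled by $\mathbb{E} A_{w_i}$.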
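The telescoping step is a purely algebraic identity: expanding $\prod_j (d_j + e_j)$ over subsets and removing the empty-set term $\prod_j e_j$. A brute-force check on small inputs, with $d_j$ standing in for $q_j - \mathbb{E} q_j$ and $e_j$ for $\mathbb{E} q_j$ (both names are placeholders), might look as follows.

```python
import math
from itertools import combinations

def subset_expansion(d, e):
    """Sum over nonempty S of prod_{j in S} d[j] * prod_{j not in S} e[j]."""
    n = len(d)
    total = 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            term = 1.0
            for j in range(n):
                term *= d[j] if j in chosen else e[j]
            total += term
    return total

def direct_difference(d, e):
    """prod_j (d[j] + e[j]) - prod_j e[j], the left-hand side of the identity."""
    return math.prod(dj + ej for dj, ej in zip(d, e)) - math.prod(e)

if __name__ == "__main__":
    d = [0.1, -0.2, 0.05, 0.3]   # stand-ins for the centered factors
    e = [0.5, 0.7, 0.9, 0.4]     # stand-ins for the expectations
    print(subset_expansion(d, e), direct_difference(d, e))
```

The enumeration is exponential in the number of factors, which is fine for a check but also illustrates why the proof relies on most $q_j$ being identically 1.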

Note that in (15) we require a product of $l$ of these terms. Thus, by applying (16) multiple times, we can write
\[ \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] = \sum_{S_1, \ldots, S_l} \prod_{i=1}^{l} \prod_{j \in S_i} (q_j^{w_i} - \mathbb{E} q_j^{w_i}) \prod_{j \notin S_i} \mathbb{E} q_j^{w_i}, \tag{17} \]
where it is important to note that the sum is over nonempty subsets of $[2n-1]$.

In expectation, we will see that each difference term $q_j^{w_i} - \mathbb{E} q_j^{w_i}$ that appears in the product contributes a factor of $n^{-1/2}$, and thus that the magnitude of (17) is at most $O(n^{-l/2})$. To show this, we require the ability to estimate moments of the terms that appear in the right hand side of (17). This is expressed in the following lemma.

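Lemma 2.12. Fix a polynomial $q(x) = x^{a_1} (1-x)^{a_2}$, and fix an $m \in \mathbb{N}$. There is a constant $C = C(m, a_1, a_2)$ so that
\[ \max_{1 \le i \le n} \mathbb{E} \left| q(c_i^2) - \mathbb{E} q(c_i^2) \right|^m \le C n^{-m/2}, \quad\text{and}\quad \max_{1 \le i \le n-1} \mathbb{E} \left| q((c_i')^2) - \mathbb{E} q((c_i')^2) \right|^m \le C n^{-m/2}. \]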



Proof. In the current parameterization, we recall that $c_i^2$ and $(c_i')^2$ are mutually independent Beta random variables with parameters

c 2 ~ Beta(^ + a-\i - n), -^ + o" 1 ^ - »)), and 
(cO 2 - BetaCcrH £ + cT^i -2n + 1)). 

The primary tool in this proof is the Poincaré inequality for Beta random variables. From Lemma B.1, a Beta variable $X \sim \mathrm{Beta}(p_1, p_2)$ satisfies a Poincaré inequality
\[ \mathrm{Var}\, f(X) \le \frac{1}{4(p_1 + p_2)} \mathbb{E} |f'(X)|^2 \]
for any Lipschitz function $f$ on $[0,1]$. Let $\mathcal{M}$ denote the collection of all Beta variables appearing in the
matrix model. We note that for all these variables, the sum of their parameters is at least — \— — 2] . By 
hypothesis on the parameters of the matrix, a < 1/2, and thus there is a constant C so that 



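\[ \max_{X \in \mathcal{M}} \sup_{\|f\|_{\mathrm{Lip}} < \infty} \frac{\mathrm{Var}\, f(X)}{\mathbb{E} |f'(X)|^2} \le \frac{C}{n}. \]
Further, by applying each of these inequalities to $q(X)$ for any $X \in \mathcal{M}$, we see that for any Lipschitz $f$,
\[ \mathrm{Var}\, f(q(X)) \le \frac{C}{n} \mathbb{E} |f'(q(X))\, q'(X)|^2. \]
Note that $|q'(x)| \le (a_1 + a_2)$ on $[0,1]$, and thus
\[ \mathrm{Var}\, f(q(X)) \le \frac{C (a_1 + a_2)^2}{n} \mathbb{E} |f'(q(X))|^2 \]
for all Lipschitz functions on the interval and any $X \in \mathcal{M}$. It is well known that a Poincaré inequality implies exponential integrability (see [10]). Precisely,
\[ \mathbb{E} \exp \left[ \frac{\sqrt{n}\, |q(X) - \mathbb{E} q(X)|}{12 (a_1 + a_2) \sqrt{C}} \right] \le 2, \]
for every $X \in \mathcal{M}$. By expanding the exponential in its series, the claim follows. □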
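As a small consistency check of the Poincaré inequality $\mathrm{Var}\, f(X) \le \frac{1}{4(p_1+p_2)} \mathbb{E}|f'(X)|^2$ used in the proof, one can take $f(x) = x$, for which the left side is the exact Beta variance and the right side is $1/(4(p_1+p_2))$. The sketch below only tests this special case and the resulting $1/n$ scaling when both parameters grow linearly in $n$; the sample parameter values are arbitrary choices for illustration, not the ensemble's parameters.

```python
def beta_variance(p1, p2):
    """Exact variance of a Beta(p1, p2) random variable."""
    total = p1 + p2
    return p1 * p2 / (total * total * (total + 1.0))

def poincare_bound_identity(p1, p2):
    """Poincare bound for f(x) = x, where E|f'(X)|^2 = 1."""
    return 1.0 / (4.0 * (p1 + p2))

if __name__ == "__main__":
    # Parameters growing linearly in n, as for the Beta variables in the matrix model.
    for n in (10, 100, 1000):
        p1, p2 = 0.3 * n, 0.7 * n
        print(n, n * beta_variance(p1, p2), n * poincare_bound_identity(p1, p2))
```

The printed values of $n \cdot \mathrm{Var}$ stay bounded, which is the scaling that produces the $n^{-m/2}$ moment bound in the lemma.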

As a consequence of Lemma 2.12, it is possible to estimate the contribution of any product of terms as in (17).

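Lemma 2.13. There is a constant $C = C(l, \max_{1 \le i \le l} m_i)$ so that for any $l$-tuple $(w_1, \ldots, w_l) \in \mathcal{A}_{2m_1,n} \times \cdots \times \mathcal{A}_{2m_l,n}$,
\[ \left| \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] \right| \le C n^{-l/2}. \]
Furthermore, the dominant contribution is given by
\[ D_{(w_i)_i} := \sum_{s_1, \ldots, s_l} \prod_{i=1}^{l} \left[ (q_{s_i}^{w_i} - \mathbb{E} q_{s_i}^{w_i}) \prod_{j \ne s_i} \mathbb{E} q_j^{w_i} \right], \]
with the sum over all $l$-tuples $(s_1, \ldots, s_l) \in [2n-1]^l$, and
\[ \left| \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] - \mathbb{E} D_{(w_i)_i} \right| \le C n^{-(l+1)/2}. \]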



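Proof. We recall (17):
\[ \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] = \sum_{S_1, \ldots, S_l} \prod_{i=1}^{l} \prod_{j \in S_i} (q_j^{w_i} - \mathbb{E} q_j^{w_i}) \prod_{j \notin S_i} \mathbb{E} q_j^{w_i}, \]
where the sum is over nonempty subsets $S_i \subseteq [2n-1]$. Taking expectations, most of these summands will be 0. This is because for each word $w_i$, there are at most $4m_i$ nontrivial polynomials $q_j^{w_i}$, where $w_i \in \mathcal{A}_{2m_i,n}$. Thus, there are at most $2^{4m_1} 2^{4m_2} \cdots 2^{4m_l}$ nonzero summands of the form
\[ P_{S_1, \ldots, S_l} := \mathbb{E} \prod_{i=1}^{l} \prod_{j \in S_i} (q_j^{w_i} - \mathbb{E} q_j^{w_i}) \prod_{j \notin S_i} \mathbb{E} q_j^{w_i}, \tag{18} \]
and thus it suffices to show the desired bound for an arbitrary term such as this. From each $S_i$, pick an arbitrary $j_i$. Each $q_j^{w_i}$ is a random variable supported on $[0,1]$, and thus both $|q_j^{w_i} - \mathbb{E} q_j^{w_i}| \le 1$ and $|\mathbb{E} q_j^{w_i}| \le 1$. Therefore, the term in (18) can be bounded by
\[ |P_{S_1, \ldots, S_l}| \le \mathbb{E} \prod_{i=1}^{l} \left| q_{j_i}^{w_i} - \mathbb{E} q_{j_i}^{w_i} \right| \le \frac{1}{l} \sum_{i=1}^{l} \mathbb{E} \left| q_{j_i}^{w_i} - \mathbb{E} q_{j_i}^{w_i} \right|^l, \]
where we have applied the arithmetic-geometric mean inequality. By applying Lemma 2.12, we conclude that there is a constant $C$ that depends only on $\max_{1 \le i \le l} m_i$ and $l$ so that
\[ |P_{S_1, \ldots, S_l}| \le C n^{-l/2}. \]
Summing over all possible nonzero summands, the first conclusion follows. Note that if $\sigma := |S_1| + |S_2| + \cdots + |S_l| > l$, then the same argument (with the same constant, no less) shows
\[ |P_{S_1, \ldots, S_l}| \le C n^{-\sigma/2}, \]
from which the second conclusion follows. □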

Having established these bounds, we introduce the notion of a dependency graph. 

Definition 2.14. For any tuple of tridiagonal trace paths $(w_1, w_2, \ldots, w_l)$, define the dependency graph $\mathcal{G}$ to be a graph with vertex set $[l]$ and $i \not\sim j$ if and only if $A_{w_i}$ and $A_{w_j}$ are functions of mutually independent random variables.

The family of vector variables
\[ \left\{ (A_{w_i})_{i \in S} \right\}_{S}, \]
where $S$ ranges over all connected components of $\mathcal{G}$, is a mutually independent family of random variables. The importance of these connected components is that there are very few $l$-tuples of tridiagonal trace paths that have few connected components in their dependency graph. Moreover, it is possible to estimate exactly how many trace paths have such dependency graphs. This motivates the following definition.

Definition 2.15. For any $\chi \in \{1, 2, \ldots, \lfloor l/2 \rfloor\}$, let $B_\chi$ be the collection of all $l$-tuples in $\mathcal{A}_{2m_1,n} \times \mathcal{A}_{2m_2,n} \times \cdots \times \mathcal{A}_{2m_l,n}$ whose dependency graphs have $\chi$ connected components and no isolated vertices. For any such tuple of words, let $\mathcal{E} = \mathcal{E}(w_1, \ldots, w_l)$ denote the edge set of the dependency graph.

When $l$ is even, $B_{l/2}$ is the collection of all $l$-tuples of trace paths whose dependency graphs are perfect matchings. With this definition, we can count the number of $l$-tuples of trace paths having a particular number of connected components.

Lemma 2.16. For any $\chi \in \mathbb{N}$, there is a constant $C = C(\chi, \max_{1 \le i \le l} m_i)$ so that $|B_\chi| \le C n^\chi$.

Proof. This ultimately stems from the observation that there are only finitely many entries in the matrix 
that depend on a given entry. Thus, once any arbitrary trace path in a connected component has been 
chosen, the remainder of the trace paths must start nearby. Formally, we begin by bounding the number of 
ways to construct a connected component on s vertices. 

Without loss of generality, suppose these $s$-tuples are chosen from $\mathcal{A}_{2m_1,n} \times \mathcal{A}_{2m_2,n} \times \cdots \times \mathcal{A}_{2m_s,n}$. As we would like choices having a connected dependency graph, we overcount by first choosing a desired spanning tree and then filling out the graph. As there are only $s^{s-2}$ such spanning trees, we lose at most a constant factor.

Let $M = \max_{1 \le i \le l} m_i$, and choose the first trace path in the tuple arbitrarily; there are $|\mathcal{A}_{2m_1,n}|$ possible choices for this path. Traversing the vertices of the tree in a depth first search, each vertex traversed must depend on the previously chosen path $w_{\mathrm{prev}} \in \mathcal{A}_{2m_{\mathrm{prev}},n}$. This forces the choice of $w_{\mathrm{new}} \in \mathcal{A}_{2m_{\mathrm{new}},n}$ to have that $A_{w_{\mathrm{new}}}$ depends on $A_{w_{\mathrm{prev}}}$, and thus the starting point of $w_{\mathrm{new}}$ must be no more than $m_{\mathrm{new}} + m_{\mathrm{prev}}$ steps from the starting point of the previous. Thus there are at most $4M |\mathcal{A}_{2M}|$ ways to choose the new path. This bound holds for every vertex explored in the depth first search, and we arrive at the bound that there are at most $[4M |\mathcal{A}_{2M}|]^s \cdot n$ ways to choose trace paths having dependency graph spanned by a given tree.

Summing over all possible partitions of $l$ with $\chi$ parts, i.e. all multisets of naturals $\{s_i\}$ so that $s_1 + s_2 + \cdots + s_\chi = l$, and choosing components of these sizes for each, we arrive at the bound that there is a constant $C$ so that $|B_\chi| \le C n^\chi$. □

It is now possible to identify the asymptotically relevant portions of an arbitrary mixed moment, and hence prove Proposition 2.11.



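Proof of Proposition 2.11. In terms of the notation $B_\chi$, we recall (15) and rewrite it as
\begin{align*}
\mathbb{E} \prod_{u \in m} X_{x^u,A} &= \sum_{w_1, \ldots, w_l} \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] \\
&= \sum_{\chi=1}^{\lfloor l/2 \rfloor} \sum_{(w_1, \ldots, w_l) \in B_\chi} \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right], \tag{19}
\end{align*}
noting that this sum contains no $l$-tuples of words with isolated vertices in their dependency graphs, as these vanish identically on taking expectations. By Lemma 2.13, there is a constant $C_1$ sufficiently large that
\[ \left| \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] \right| \le C_1 n^{-l/2}, \]
for every word in the sum. Also, by Lemma 2.16, there is a constant $C_2$ sufficiently large that for all $1 \le \chi \le l/2$, $|B_\chi| \le C_2 n^\chi$. It is immediate that if $l$ is odd, then by (19),
\[ \left| \mathbb{E} \prod_{u \in m} X_{x^u,A} \right| \le \sum_{\chi=1}^{\lfloor l/2 \rfloor} C_1 n^{-l/2}\, C_2 n^{\chi} = O(n^{-1/2}). \]
If $l$ is even, however, then applying the same bound to terms for which $\chi < l/2$,
\begin{align*}
\mathbb{E} \prod_{u \in m} X_{x^u,A} &= \sum_{(w_1, \ldots, w_l) \in B_{l/2}} \mathbb{E} \prod_{i=1}^{l} \left[ A_{w_i} - \mathbb{E} A_{w_i} \right] + O(n^{-1/2}) \\
&= \sum_{(w_i)_i \in B_{l/2}} \prod_{\{a,b\} \in \mathcal{E}} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right] + O(n^{-1/2}). \tag{20}
\end{align*}
It only remains to show that the Wick word has the same form, i.e. it should be shown that
\[ W := \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} \left[ X_{x^{m_a},A} X_{x^{m_b},A} \right], \tag{21} \]
where $G$ ranges over all perfect matchings of $[l]$, has the same asymptotically relevant terms as (20). We recall (15), due to which we may rewrite
\[ W = \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \sum_{\mathcal{A}_{2m_a,n} \times \mathcal{A}_{2m_b,n}} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right], \]
where the inner sum may be taken over all pairs of $l$-tuples. For a fixed perfect matching $G$, every possible tuple $(w_1, \ldots, w_l)$ is represented exactly once. After commuting the inner sum and the product, we may write
\[ W = \sum_{(w_1, \ldots, w_l)} \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right]. \]
As before, we may ignore $l$-tuples whose dependency graphs have an isolated vertex, and thus we write
\[ W = \sum_{\chi=1}^{l/2} \sum_{(w_i)_i \in B_\chi} \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right]. \]
We will bound the contribution of terms having $\chi < l/2$, and we note that there is a constant $C_3$ so that for any pairing $G$ and any tuple of paths $(w_i)_i$,
\[ \left| \prod_{\{a,b\} \in \mathcal{E}(G)} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right] \right| \le C_3 n^{-l/2}, \]
which follows from applying Lemma 2.13. Writing $C_4 = l! / (2^{l/2} (l/2)!)$ for the number of perfect matchings on $[l]$, we have
\[ \sum_{\chi=1}^{l/2-1} \sum_{(w_i)_i \in B_\chi} \sum_{G} \prod_{\{a,b\} \in \mathcal{E}(G)} \left| \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right] \right| \le \sum_{\chi=1}^{l/2-1} C_2 n^{\chi} \cdot C_4 \cdot C_3 n^{-l/2} = O(n^{-1/2}). \]
For each tuple of words $(w_i)_i \in B_{l/2}$, there is exactly one choice of pairing $G$ so that the product is nonzero, and thus
\[ W = \sum_{(w_i)_i \in B_{l/2}} \prod_{\{a,b\} \in \mathcal{E}} \mathbb{E} \left[ A_{w_a} - \mathbb{E} A_{w_a} \right] \left[ A_{w_b} - \mathbb{E} A_{w_b} \right] + O(n^{-1/2}), \]
which completes the proof on comparison with (20). □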
2.3. Computing the Covariance. We now turn to showing that all the pairwise covariances $\mathrm{Cov}(X_{x^k,A}, X_{x^l,A})$ have limits and produce an expression for that limiting covariance. We will use $C_{k,l}$ to denote the covariance we eventually show to be the limit. These covariances can be described in terms of the polynomials $p_k(x,y)$ introduced in Section 2.1. The exact form of the covariance is given by an integral against a parameter $\sigma$. In terms of $\sigma$, define the expressions
\[ x := \frac{\sqrt{(b+\sigma)(1-a+\sigma)}}{1+2\sigma}, \quad\text{and}\quad y := \frac{\sqrt{(a+\sigma)(1-b+\sigma)}}{1+2\sigma}. \tag{22} \]

The matrix C k ^ for k, I > 1 can now be defined by 

a f° 1 

(23) Ck.i ■= — / [(^xPfe^Pm + d y p k d y p m ) {1 - x 2 - y 2 ) - (d x p k d y p m + d y p k d x p m ) (2xy)] da. 

Remark 2.17. In this form, the integrand is separated into positive and negative parts. We can check that $x^2 + y^2 < 1$ for all $-a < \sigma < 0$. Furthermore, because the $p_k$ have all positive coefficients, $x$ is nonnegative, and $y$ is nonnegative, it follows that
\[ (\partial_x p_k \partial_x p_m + \partial_y p_k \partial_y p_m)(1 - x^2 - y^2) > 0, \quad\text{and}\quad (\partial_x p_k \partial_y p_m + \partial_y p_k \partial_x p_m)(2xy) > 0, \]
for all $-a < \sigma < 0$. To check that $x^2 + y^2 < 1$, we clear the denominator and expand the terms to show that this is equivalent to
\[ b(1-a) + (1-b)a < 1 + 2\sigma + 2\sigma^2. \]
The quadratic on the right is increasing for $-1/2 < \sigma < 0$, and thus to show the inequality, it suffices to bound the left side by the value of the right side at $\sigma = -a$, i.e. to show that
\[ b(1-a) + (1-b)a = a + b(1-2a) < 1 - 2a + 2a^2. \]
Using that $1 - 2a > 0$ and $b < 1 - a$, the inequality follows.
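The inequality $x^2 + y^2 < 1$ and the "clear the denominator" step can be checked numerically on a grid of $\sigma$ values. The sketch below uses the expressions for $x$ and $y$ in terms of $\sigma$, $a$, $b$; the sample parameters satisfy $a < 1/2$ and $a \le b < 1 - a$, where the condition $a \le b$ is an assumption made here so that the square roots are real on the whole interval.

```python
import math

def xy(sigma, a, b):
    """x and y as functions of sigma, for ensemble parameters a and b."""
    d = 1.0 + 2.0 * sigma
    x = math.sqrt(max(0.0, (b + sigma) * (1.0 - a + sigma))) / d
    y = math.sqrt(max(0.0, (a + sigma) * (1.0 - b + sigma))) / d
    return x, y

def max_x2_plus_y2(a, b, steps=1000):
    """Maximum of x^2 + y^2 over an evenly spaced grid of sigma in [-a, 0]."""
    best = 0.0
    for i in range(steps + 1):
        sigma = -a + a * i / steps
        x, y = xy(sigma, a, b)
        best = max(best, x * x + y * y)
    return best

def cleared_identity_gap(sigma, a, b):
    """(1+2s)^2 (x^2+y^2) minus [b(1-a) + (1-b)a + 2s + 2s^2]; should vanish."""
    x, y = xy(sigma, a, b)
    lhs = (1.0 + 2.0 * sigma) ** 2 * (x * x + y * y)
    rhs = b * (1.0 - a) + (1.0 - b) * a + 2.0 * sigma + 2.0 * sigma ** 2
    return lhs - rhs

if __name__ == "__main__":
    print(max_x2_plus_y2(0.2, 0.5), max_x2_plus_y2(0.3, 0.6))
```

A grid check is of course not a proof; it only guards against sign or transcription slips in the algebra.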

Our primary purpose in this section is to prove the following Proposition. 

Proposition 2.18. For each fixed $k, l \in \mathbb{N}$, as $n \to \infty$,
\[ \mathrm{Cov}(X_{x^k,A}, X_{x^l,A}) = \mathbb{E} \left[ X_{x^k,A} X_{x^l,A} \right] = C_{k,l} + O(n^{-1/2}). \]
Note that combining this Proposition with Proposition 2.11, we have proven Proposition 2.9. We turn immediately towards proving Proposition 2.18. We recall that by (15), we have
\[ \mathbb{E} \left[ X_{x^k,A} X_{x^l,A} \right] = \sum_{w_k, w_l} \mathbb{E} \left[ A_{w_k} - \mathbb{E} A_{w_k} \right] \left[ A_{w_l} - \mathbb{E} A_{w_l} \right]. \]

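By Lemma 2.16, there is a constant $K_1$ so that there are at most $K_1 \cdot n$ such words. Applying the second portion of Lemma 2.13, we have that there is a constant $K_{k,l}$ so that
\[ \left| \mathbb{E} \left[ X_{x^k,A} X_{x^l,A} \right] - \sum_{w_k, w_l} \mathbb{E} D_{(w_k, w_l)} \right| \le K_1 n \cdot K_{k,l} \cdot n^{-3/2}, \]
where we recall that $D_{(w_k, w_l)}$ is given by
\[ D_{(w_k, w_l)} = \sum_{s_k, s_l} \prod_{i \in \{k, l\}} \left[ (q_{s_i}^{w_i} - \mathbb{E} q_{s_i}^{w_i}) \prod_{j \ne s_i} \mathbb{E} q_j^{w_i} \right], \]
with the sum over all choices of $s_k, s_l \in [2n-1]$. Thus, it suffices to analyze the quantity $\sum_{w_k, w_l} \mathbb{E} D_{(w_k, w_l)}$ and show it has the desired limit. Note that by the construction of $q_j^{w_i}$, each of $q_{s_k}^{w_k}$ and $q_{s_l}^{w_l}$ are independent if $s_k \ne s_l$, and thus we have
\[ \mathbb{E} \left[ X_{x^k,A} X_{x^l,A} \right] = \sum_{w_k, w_l} \sum_{t=1}^{2n-1} \mathbb{E} \left[ q_t^{w_k} - \mathbb{E} q_t^{w_k} \right] \left[ q_t^{w_l} - \mathbb{E} q_t^{w_l} \right] \left[ \prod_{j \ne t} \mathbb{E} q_j^{w_k}\, \mathbb{E} q_j^{w_l} \right] + O(n^{-1/2}). \]
We define $r_t$ so that
\[ r_t := \sum_{w_k, w_l} \mathbb{E} \left[ q_t^{w_k} - \mathbb{E} q_t^{w_k} \right] \left[ q_t^{w_l} - \mathbb{E} q_t^{w_l} \right] \left[ \prod_{j \ne t} \mathbb{E} q_j^{w_k}\, \mathbb{E} q_j^{w_l} \right], \tag{24} \]
and note that by commuting sums in the previous equation, we have
\[ \mathbb{E} \left[ X_{x^k,A} X_{x^l,A} \right] = \sum_{t=1}^{2n-1} r_t + O(n^{-1/2}). \tag{25} \]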



Let $\{z_i\}_{i=1}^{2n-1}$ be the enumeration of all the Beta variables in $\mathcal{M}$, where $z_i = (c_i)^2$ for $1 \le i \le n$ and $z_i = (c_{i-n}')^2$ when $n+1 \le i \le 2n-1$. This makes each $q_j^{w_i}$ a polynomial in $z_j$. The first step in the analysis amounts to using Taylor approximation to pull the expectations inside the $q_j^{w_i}$ polynomials.

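Lemma 2.19. There is a constant $K = K(k, l)$ so that for all $1 \le t \le 2n-1$,
\[ |r_t| \le K n^{-1}. \]
Moreover, it is possible to identify the dominant contribution $r_t^D$, which is given by
\[ r_t^D := \mathrm{Var}(z_t) \sum_{w_k, w_l} \left[ q_t^{w_k\,\prime}(\mathbb{E} z_t) \right] \left[ q_t^{w_l\,\prime}(\mathbb{E} z_t) \right] \left[ \prod_{j \ne t} q_j^{w_k}(\mathbb{E} z_j)\, q_j^{w_l}(\mathbb{E} z_j) \right], \]
and which has
\[ |r_t - r_t^D| \le K n^{-3/2}. \]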



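Proof. The first claim follows from Lemma 2.12 and from the fact that the number of trace paths that depend on $z_t$ is bounded by some $K = K(k, l)$. The second claim will follow from Taylor approximation. For any polynomial $q_j^{w_k}$ or $q_j^{w_l}$, it is possible to bound the maximums of the derivatives over $[0,1]$ in terms of $k$ and $l$. Each polynomial has the form $q_j^{w_i}(x) = x^{a^1_{i,j}} (1-x)^{a^2_{i,j}}$, and hence its first and second derivatives can be bounded by $a^1_{i,j} + a^2_{i,j}$ and $(a^1_{i,j} + a^2_{i,j})^2$. These parameters $a^1_{i,j}$ and $a^2_{i,j}$ can in turn be bounded by $2i$, to yield
\[ \max_{x \in [0,1]} \left| q_j^{w_i\,\prime}(x) \right| \le 4i \quad\text{and}\quad \max_{x \in [0,1]} \left| q_j^{w_i\,\prime\prime}(x) \right| \le (4i)^2, \]
for either $i \in \{k, l\}$. These imply that the $0^{\text{th}}$ order approximation has error
\[ \left| q_j^{w_i}(z_j) - q_j^{w_i}(\mathbb{E} z_j) \right| \le 4i\, |z_j - \mathbb{E} z_j|, \]
and the $1^{\text{st}}$ order approximation has error
\[ \left| q_j^{w_i}(z_j) - q_j^{w_i}(\mathbb{E} z_j) - q_j^{w_i\,\prime}(\mathbb{E} z_j)(z_j - \mathbb{E} z_j) \right| \le 8 i^2 |z_j - \mathbb{E} z_j|^2. \]
We recall the definition of $r_t$, which was given by
\[ r_t = \sum_{w_k, w_l} \underbrace{\mathbb{E} \left[ q_t^{w_k} - \mathbb{E} q_t^{w_k} \right] \left[ q_t^{w_l} - \mathbb{E} q_t^{w_l} \right]}_{(i)} \underbrace{\left[ \prod_{j \ne t} \mathbb{E} q_j^{w_k}\, \mathbb{E} q_j^{w_l} \right]}_{(ii)}. \]
Using the $1^{\text{st}}$ order approximation for term (i), we bound
\[ D_{(i)} := \left| \mathbb{E} \left[ q_t^{w_k} - \mathbb{E} q_t^{w_k} \right] \left[ q_t^{w_l} - \mathbb{E} q_t^{w_l} \right] - \mathbb{E} \left[ q_t^{w_k\,\prime}(\mathbb{E} z_t)\, q_t^{w_l\,\prime}(\mathbb{E} z_t)\, (z_t - \mathbb{E} z_t)^2 \right] \right| \le K_1 n^{-3/2}, \tag{26} \]
with the constant implicitly depending on $k$, $l$, and the constants assured by Lemma 2.12. Using the $0^{\text{th}}$ order approximation for term (ii), we will bound the difference between (ii) and its approximation. This will be done by replacing each $z_j$ by $\mathbb{E} z_j$ one term at a time. As there are at most $2k + 2l$ non-constant polynomials $q_j^{w_k}$ and $q_j^{w_l}$, this reduces bounding (ii) to bounding, for any fixed $u$,
\[ \Delta_u := \left[ q_u^{w_k}(\mathbb{E} z_u)\, q_u^{w_l}(\mathbb{E} z_u) - \mathbb{E} q_u^{w_k}(z_u)\, \mathbb{E} q_u^{w_l}(z_u) \right] \prod_{\substack{j < u \\ j \ne t}} q_j^{w_k}(\mathbb{E} z_j)\, q_j^{w_l}(\mathbb{E} z_j) \prod_{\substack{j > u \\ j \ne t}} \mathbb{E} q_j^{w_k}(z_j)\, \mathbb{E} q_j^{w_l}(z_j). \]
Recalling that all $q_j^{w_i}$ are almost surely at most 1, this can be bounded by
\[ |\Delta_u| \le \left| q_u^{w_k}(\mathbb{E} z_u)\, q_u^{w_l}(\mathbb{E} z_u) - \mathbb{E} q_u^{w_k}(z_u)\, \mathbb{E} q_u^{w_l}(z_u) \right| \le (4k + 4l)\, \mathbb{E} |z_u - \mathbb{E} z_u| \le K_2 n^{-1/2}. \]
These bounds applied to the difference of (ii) and its approximation show
\[ D_{(ii)} := \left| \prod_{j \ne t} \mathbb{E} q_j^{w_k}(z_j)\, \mathbb{E} q_j^{w_l}(z_j) - \prod_{j \ne t} q_j^{w_k}(\mathbb{E} z_j)\, q_j^{w_l}(\mathbb{E} z_j) \right| \le \sum_{\substack{1 \le u \le 2n-1 \\ q_u^{w_k} q_u^{w_l} \ne 1}} |\Delta_u| \le (2k + 2l) \cdot K_2 n^{-1/2}. \tag{27} \]
By combining Lemma 2.12 with Cauchy-Schwarz, one has that (i) is at most $K_3 n^{-1}$. Therefore, we can combine both of (26) and (27) to show
\begin{align*}
|r_t - r_t^D| &\le \sum_{w_k, w_l} \left( |(i)|\, D_{(ii)} + D_{(i)} \left| \prod_{j \ne t} q_j^{w_k}(\mathbb{E} z_j)\, q_j^{w_l}(\mathbb{E} z_j) \right| \right) \\
&\le \sum_{w_k, w_l} \left( K_3 n^{-1} \cdot (2k + 2l) \cdot K_2 n^{-1/2} + K_1 n^{-3/2} \cdot 1 \right).
\end{align*}
As the sum is only over paths that depend upon $t$, the proof is complete. □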



All the expectations in $r_t^D$ are approximately equal to one of two values, $\mathbb{E} z_t$ and $\mathbb{E} z_{t+n}$ (or $t - n$ in the case $t > n$), on account of the trace paths being forced to overlap. Thus, this can be expressed in terms of the polynomials $p_k(x,y)$ for values of $t$ for which the trace paths are sufficiently far from the matrix edge. The values of $x$ and $y$ are given in terms of the expectations of matrix entries. Put $s(t) = t$ if $1 \le t \le n$, and put $s(t) = t - n$ if $n+1 \le t \le 2n-1$. The values of $x$ and $y$ are given by
\[ x(t) := \sqrt{\mathbb{E} \left[ c_s^2 (1 - (c_s')^2) \right]} \quad\text{and}\quad y(t) := \sqrt{\mathbb{E} \left[ (c_s')^2 (1 - c_s^2) \right]}. \tag{28} \]
Note that these are not exactly the expressions for $x$ and $y$ given in (22), but we will show that these two
quantities are strongly related. In what follows, we unequivocally mean the x and y given in 



Lemma 2.20. Define ^ to be 

6 



i (\rar(n 2 \ \ f xd x p k (x,y) ydyPk (x ,y) \\ ( xd 0: p m (x,y 

D i I Var(cj N — g^ i-ec^ JJ ^ e^ — 



) _ y9yPm{x,y) 



1-Ec? 



u„ r l'j2\ |"( V d vPk(x,y) xd x p k (x : y) ^\ ( yd y p m (x,y) xd x p m (x,y) \ 
Var l c s J [^ ESp i_ E (^)2 J J ^ 15? l-E(c'J^ J 



1 < t< n 

n+l<t<2n-l. 



There is a constant K — K(k, 1) so that for allk + l<t<n— k — I and n + k + I < t < 2n — k — I — 1. 
\g\<Kn-\ 

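Proof. We show the proof for $1 \le t \le n$. The proof for $t > n$ is identical. We recall that $r_t^D$ is given by
\[ r_t^D = \mathrm{Var}(z_t) \sum_{w_k, w_l} \left[ q_t^{w_k\,\prime}(\mathbb{E} z_t) \right] \left[ q_t^{w_l\,\prime}(\mathbb{E} z_t) \right] \left[ \prod_{j \ne t} q_j^{w_k}(\mathbb{E} z_j)\, q_j^{w_l}(\mathbb{E} z_j) \right]. \]
This splits nicely as $r_t^D = \mathrm{Var}(z_t)\, M_t(w_k)\, M_t(w_l)$, where we define
\[ M_t(w_i) := \sum_{w_i} \left[ q_t^{w_i\,\prime}(\mathbb{E} z_t) \right] \left[ \prod_{j \ne t} q_j^{w_i}(\mathbb{E} z_j) \right]. \]
This $M_t(w_i)$ is essentially computable from just two expectations, $\mathbb{E} z_t$ and $\mathbb{E} z_{t+n}$. Letting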



M t (v)i) 



D -E. 



U; L 



n 



; (Ez t )C„(E^+n) 



l<j<n 



we show that l-M^w;) — M/ 3 ^) is 0(n 1 ). We will require the formulae for Ez t = Ec 2 and Ez 4+ „ = E(c' t ) 2 , 
and so we recall the precise distributions of these entries, 

cj ~ Beta(^ + a-^i - n), =^ + a -i(j _ „)) and ( c ^) 2 ~ Beta^ 1 *, £ + a" 1 ^ - 2n + 1)). 

Their expectations are given by 



nb 



(29) 
(30) 



E Zt =Ec/ t =^ 



|+a _1 (t-n) 6-a + 2a; 



s-+a- 1 (2t-n) l-2a + 2a^ 



Ez t+n = E(c' t ) 2 = 



orH 



-?- + a- 1 (2t-7i) 
1 -2a + 2a^



Each of these expectations, as a function of $t$, is uniformly Lipschitz continuous over $0 \le t \le n$ with constant $K_1 \cdot n^{-1}$ for some $K_1$ depending only on the ensemble parameters. By the same method used in the proof of Lemma 2.19, it is straightforward to show that there is a constant $K_2 = K_2(k, l)$ so that
\[ \left| M_t(w_i) - M_t^D(w_i) \right| \le K_2 n^{-1}. \]

We recall the notation of Lemma 12.61 where we defined S™* (t) to be the number of horizontal steps of Wi 
from level i to i and S™' (t) to be the number of steps of Wi from level i to i + 1 or vice versa. The polynomial 
q™* may be identified precisely in terms of these counts. Recalling the matrix model (|9]), the variables Ct 
and St appear only in the t row from the bottom of the matrix. It follows that 



irtf) 



*C*) (1 _ ^VW/^ 



and thus, differentiating, 



It ( z ) = 



2 2 

S*'(t)q?(z) Sf>(t)q?*(z) 



(31) 2z 2(1 -z) ' 
We now relate M^{wi) to expressions containing Pi(x, y). The essential realization is that 

(32) E-<^(*)[IT gf(E^)?;+JEz i+ „)l=^ h{w)x h ^y^ h ^=xd xPl {x,y), 

where $h(w_l)$ is the number of horizontal steps $w_l$ makes, and $x$ and $y$ are defined earlier. This is a direct consequence of the bijection between paths $w_1 \in \mathcal{A}_{2l,n}$ that have a single marked horizontal edge at level $t$ and paths $w_2 \in \mathcal{A}_{2l}$ having a single marked horizontal edge. This is given by the map that simply vertically shifts $w_1$ to start at 0; note that this is invertible on account of the mark being forced to lie at level $t$. For $n - k - l > t > k + l$, every summand on the left hand side of (32) is exactly the summand given on the right when identifying paths via this bijection (note that for $t$ too close to the matrix edge, some of the paths on the left hand side will be 0, destroying the identity). Similar reasoning shows



(33) 



E, 



S 



f (*)[!! 



<j<n 



QT (^Zt)qJi n (Ezt +n ) = yd y p t (x, y) 



By combining (|3T|). ([32]), and ([33]), it follows that 

xd x p t (x,y) yd v p % {x,y) 



"?<&>- 2Ez t 2E[1-^]' 

The conclusion of the lemma follows more or less immediately. By Lemma 2.12, the variance of $z_t$ can be controlled by $K_3 n^{-1}$, with $K_3$ depending only on the matrix parameters. The moduli of $M_t(w_i)$ and $M_t^D(w_i)$ can be controlled by some $K_4 = K_4(k, l)$, and so



|f t | = \V&T(z t )M t {w k )M t (wi) 
completing the proof. 



Var(z t )M t D (w fe )M t D (wi)| < K^ 1 ■ 2K 4 ■ K 2 n 



□ 



On account of the variance being of the order of $n^{-1}$, summing these expressions takes the form of a Riemann sum. We thus conclude the proof of the limiting covariance formula by showing that this Riemann sum converges to the integral given by $C_{k,l}$.



Proof of Proposition \2.18\ By Lemma 12.191 and (J5SJ) , 



JlL Jv , r fc 4 J\ 



k ,A A x l ,A 



E2n-: 
t=l 



+ 0{n- 1 ' 2 ) 



For 1 < t < n, Lemma \2. 201 shows that 



(34) E^= E 

t=l t=k+l 



Var( C 2) 



xd x p k (x,y) yd y p k (x,y)\ fxd x p m (x,y) yd y p m (x,y) 



Ec? 



Ec? 



+ 0{n- 1 ' 2 ). 



1!) 



We will show that the variance of these Beta variables is of order $n^{-1}$. To concisely describe the integrand that results in the limit, put $r$ to be the variable over which the integral is taken, and define $e(r)$ and $e'(r)$ as
\[ e(r) := \frac{b - a + ar}{1 - 2a + 2ar} \quad\text{and}\quad e'(r) := \frac{ar}{1 - 2a + 2ar}, \qquad 0 \le r \le 1, \tag{35} \]
so that for $r = t/n$, $e(r) = \mathbb{E} c_t^2$ and $e'(r) = \mathbb{E} (c_t')^2$ (see (29)). We will reuse the notation $x$ and $y$ by putting
\[ x(r) := \sqrt{e(r)(1 - e'(r))} \quad\text{and}\quad y(r) := \sqrt{e'(r)(1 - e(r))}. \tag{36} \]
This definition is now consistent with (22), after making a change of variables. We recall the variances of these Beta variables,

37 Varc? = ^ a J_L_2 s L_ = _"A Z_^M ™J + Q ,-2) 

n(i + 2i_2) 2 (i + 2*_ 2 + s) n (l-2a + 2 a l) 3 

Van /Van n / V n / 

Var(4)* = a - (M4±A^ = -(^)( 1 - 2a + f + 0(n - a)| 

Mi + f-2) 2 (i + f-2 + ^) » (l-2a + 2a l) 3 

where we may choose the constants in the error terms to depend only on the ensemble parameters (and not $i$). By virtue of the $\frac{1}{n}$ factor, the sum $\sum_{t=1}^{n} r_t^D$ takes the form of a Riemann sum. The integrand, exposed on the right hand side of (34), is Lipschitz continuous in $t/n$, and thus the convergence of the Riemann sum to the integral occurs with rate $O(n^{-1})$. This shows
(38) 
■A D _ aa f 1 (b - a + ar) (1 - b - a + ar) f xd x p k _ ydyPk \ ( xd x p m _ yd y p m \ . _ 1/2 

S* ' 4 J (l-2a + 2a T f V e(r) l-e(r)J\e(r) l-e(r)J 

Applying the same reasoning to n + 1 < t < 2n — 1, it follows that 

J? +1 4 J (l-2a + 2ar) 3 \e'(r) 1 - e'{r)J \ e'(r) l-e'(r)J l ; 

The sum of these two integrals (38) and (39) and the associated error bounds show that the limiting covariance exists, and their sum provides an expression for the limit. The remainder of the proof will show that this expression can be alternatively expressed in the form given by $C_{k,l}$ (defined in (23)). The primary difference is a change of variables. Take $\sigma = a(r - 1)$. The integrals become

(40) f r? = t f° e(gKl ' e( " )) (^ - r^-r) (^ - P^r) da + 0{n^ 2 ) 

ti ^J-a (1 + 2(t) \ e{a) I - e{a) ) \ e{a) l-e(a)J ' 

(ai\ v' d a f° e , (o-)(l-e , (cr)) f ydypk xd x p k \ ( ' yd v p m xd x p m \ _ 1/2 

The sum of these integrals can be shown to equal $C_{k,l}$ by checking the coefficients in front of the terms $\partial_x p_k \partial_x p_m$, $\partial_y p_k \partial_y p_m$, $\partial_x p_k \partial_y p_m$ and $\partial_y p_k \partial_x p_m$. The coefficient on $\partial_x p_k \partial_x p_m$ in the sum of the integrands (40) and (41) is given by
\[ \frac{e(\sigma)(1 - e(\sigma))}{1 + 2\sigma} \cdot \frac{x^2}{e(\sigma)^2} + \frac{e'(\sigma)(1 - e'(\sigma))}{1 + 2\sigma} \cdot \frac{x^2}{(1 - e'(\sigma))^2} = \frac{[1 - e(\sigma)][1 - e'(\sigma)] + e(\sigma) e'(\sigma)}{1 + 2\sigma} = \frac{1 - x^2 - y^2}{1 + 2\sigma}. \]
Similar manipulations show that the coefficients on each of the other terms agree with the coefficients in the integrand of $C_{k,l}$, completing the proof. □

3. Diagonalizing the Covariance Matrix

We proceed by showing that the covariances are diagonalized by the appropriate Chebyshev polynomial basis. This will be done by verifying that certain generating functions agree. We would like to show that the infinite covariance matrix can be decomposed as
\[ C = L \Lambda L^t, \]
for the diagonal matrix $\Lambda = \mathrm{diag}(0, 1, 2, 3, 4, \ldots)$, and some lower triangular matrix $L$. The $L_{n,k}$ entry of this matrix is the coefficient of the $k^{\text{th}}$ Chebyshev polynomial $T_k(x)$ in the expansion of $x^n$. Define the exponential covariance generating function $\mathcal{C}(s,t)$ as
\[ \mathcal{C}(s,t) := \sum_{k,l \ge 0} \frac{s^k t^l}{k!\, l!} C_{k,l}, \]
and define the exponential generating function of $L \Lambda L^t$ analogously,
\[ \mathcal{K}(s,t) := \sum_{k,l \ge 0} \frac{s^k t^l}{k!\, l!} (L \Lambda L^t)_{k,l}. \]
We will show that these generating functions are equal by computing their bivariate Laplace transforms and showing they are the same, from which it follows that $C = L \Lambda L^t$.
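3.0.1. Computing $L_{s,t}[\mathcal{K}]$. The coefficients $L_{n,k}$ can be computed by a recursive formula, but they have a useful Fourier-like expansion. Define $\theta$ in terms of $x$ so that
\[ \cos(\theta) = \frac{2x - \lambda_- - \lambda_+}{\lambda_+ - \lambda_-}, \]
from which it follows that
\[ 2\cos(n\theta) = 2 T_n(\cos\theta) = 2 T_n\left( \frac{2x - \lambda_- - \lambda_+}{\lambda_+ - \lambda_-} \right). \]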

Expand \e tx as a series in t, 



oo ,, n oo k 



n=0 ' n=0 ' n=0fc=0 ' n=0 k=0 

where we have used the definition of L n ^ as the coefficient of the k th Chebyshev polynomial in the expansion 
of x n . 

The Fourier interpretation allows for the matrix multiplication $L \Lambda L^t$ to be carried out by an integral. Consider the kernel $K_N(\theta, \phi)$, which will formally play the role of $\Lambda$, given by
\[ K_N(\theta, \phi) = \sum_{k=1}^{N} k \cos k\theta \cdot \cos k\phi. \]

This allows for £? to be given by 

3T( s ,t)= lim —j / / e V 2 + J ' K N (0,cf>)e 

as the coefficient on t k s l would be 

J 



lim — / / > Lf~ a— cos j0 > k coskO ■ cosfc<^| > L; 7 — cosj</> d(j)dd, 

^oar 2 j j ^ ■ fci y ^ ; i^' ■■ /i i 



which by the orthogonality of $\{\cos j\theta\}_{j \ge 0}$ on $[0, \pi]$, is exactly $(L \Lambda L^t)_{k,l}$ when $N > \min(k, l)$. Further, these integrals can be evaluated, as the expression $e^{z\cos\theta}$ has an expansion in terms of Bessel functions. Namely,
\[ e^{z\cos\theta} = I_0(z) + 2 \sum_{k=1}^{\infty} I_k(z) \cos k\theta, \]
(see [1, p. 376]). This defines the Fourier coefficients of $e^{z\cos\theta}$, from which it follows that $\mathcal{K}(s,t)$ can be rewritten as


oo 



rt cos 8 tv- i 

e K N ( 



fc=l 



where c = 



c = (^-) and r = (±£=) 



Again, we will require the Laplace transform of this generating function. Each summand $k\, I_k(rt)\, I_k(rs)$ is positive for $s, t > 0$, and so commuting the sum and the Laplace transform is justified.



L S)t [^( S ,i)]07,w) =5>L S]< [e c (* +s )4.(ri)/ fe (r S )l fow) 



fe=l 

oo 



E^TT 



£l " (w + y/G} 2 -r 2 ) k v 7 ^ 2 - ^ 2 (»? + V»7 2 - 7-2 ) fc Vv 2 - r 2 ' 
where ui = 10 — c. This has the form for the series expansion of ,.^ . a . After simplifying, this expression is 



(42) 



L. lt [^(«, *)](»/,«) = 



•\/?7 2 — r 2 \Juj 2 — r 2 yJ(Cj + r)(fj — r) + y/(ui — r)(fj + r) 



2 • 



3.0.2. Computing $L_{s,t}[\mathcal{C}]$. We will now turn to computing the Laplace transform of $\mathcal{C}$. The integrand of $C_{k,l}$ is not positive, but it can be split into two integrals whose integrands are positive (see Remark 2.17),



with 



and 



J k,l 



Rk,i - j 



l 



1 + 2(7 



1 + 2(7 



Ck,i — Lk,i — Rk,l, 
[{d x Pkd x p m + d y p k dyp m )} da, 

[(dxPkd x p m + dyPkdyPm) (x 2 + y 2 ) + {d x p k d y p m + d y p k d x p m ) (2xy)] da. 



As pi(x,y) has all positive coefficients, and both x and y are positive on the domain of integration, each 
of these integrands is positive. Defining generating functions for each array, 



k A 



s"t 



s k t l 



^M) = E un Lk > 1 and^(M) = J2 un Rk > l > 



we can write 



ii 



k,l>0 



1 



k,l>0 



k\ l\ 



(M)= 4./_ Q l-2o 



(43) 



#(«,*) 



<\ 



ii 



1 



[{d x ^>(s)d x ^(t) + d v 0>(s)d y &{t))] da, and 
[(d x ^(s)d x ^(t) + d v &{a)dy&(t)) (x 2 + y 2 ) 



4 7_ a l + 2(7 
+ (d x ^( S )dy^(t)+d v ^{s)d x ^(t)) (2xy)]da, 

where we have commuted sum and integral by the positivity of the integrands. Recall that $\mathscr{P}(t) = \mathscr{P}(t, x, y)$ is the exponential generating function for the polynomials $p_k(x, y)$, and from (10), it is jointly analytic in all variables. As $-a > -1/2$, it follows that the integrands are continuous for all $-a \le \sigma \le 0$, and all $s, t$. In particular, each of $\mathscr{L}$ and $\mathscr{R}$ is finite for all $s, t$, and it follows that we can write $\mathscr{C}(s, t)$ as the difference of these two functions, so

$$\mathscr{C}(s,t) = \mathscr{L}(s,t) - \mathscr{R}(s,t).$$
The joint Laplace transforms in $s$ and $t$ will be computed for both of these expressions. This makes heavy use of Lemma 2.7. Additionally, it requires that the order of integration be switched, which requires an argument. We prove a simplified statement, by whose method it is easily seen that these integrals can be exchanged.

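Lemma 3.1. Suppose that $\omega > \lambda_+$ and that $\eta > \lambda_+$, then

$$\int_{s,t>0}\int_{-a}^{0} \frac{e^{-\omega t - \eta s}}{1+2\sigma}\,\partial_x \mathscr{P}(s)\,\partial_x \mathscr{P}(t)\, d\sigma\, ds\, dt = \int_{-a}^{0}\int_{s,t>0} \frac{e^{-\omega t - \eta s}}{1+2\sigma}\,\partial_x \mathscr{P}(s)\,\partial_x \mathscr{P}(t)\, ds\, dt\, d\sigma,$$

and each is finite.

Proof. We begin by maximizing $x + y$ over $\sigma \in [-a, 0]$, where it is seen that the maximum is attained at $\sigma = 0$, at which point,

$$(x(0) + y(0))^2 = \left(\sqrt{b(1-a)} + \sqrt{a(1-b)}\right)^2 = \lambda_+.$$

Thus, it follows that $(x + y)^2 \le \lambda_+ < \omega$ for all $-a \le \sigma \le 0$. Recall that $\mathscr{P}(t)$ is given by $e^{t(x^2+y^2)} I_0(2xyt)$, and thus

$$\partial_x \mathscr{P}(t) = 2xt\, e^{t(x^2+y^2)} I_0(2xyt) + 2yt\, e^{t(x^2+y^2)} I_1(2xyt).$$

Using that $0 \le I_n(2xyt) \le e^{2xyt}$ for all $n$, it follows that

$$0 \le \partial_x \mathscr{P}(t) \le 2(x + y)\, e^{t(x+y)^2},$$

for $x, y \ge 0$. It follows that there is a constant $C$ so that for all $-a \le \sigma \le 0$,

$$0 \le \frac{e^{-\omega t - \eta s}}{1+2\sigma}\,\partial_x \mathscr{P}(s)\,\partial_x \mathscr{P}(t) \le C (x+y)^2 e^{-(\omega - (x+y)^2)t - (\eta - (x+y)^2)s}.$$

Using the bound on $x + y$ derived above,

$$0 \le \frac{e^{-\omega t - \eta s}}{1+2\sigma}\,\partial_x \mathscr{P}(s)\,\partial_x \mathscr{P}(t) \le C \lambda_+ e^{-(\omega - \lambda_+)t - (\eta - \lambda_+)s}.$$

Thus, provided that $\omega > \lambda_+$ and $\eta > \lambda_+$, the order of integration may be reversed by Fubini. □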
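The two ingredients of this proof, the Bessel bound $0 \le I_n(z) \le e^z$ and the formula for $\partial_x \mathscr{P}(t)$, can be spot-checked numerically. The sketch below assumes $\mathscr{P}(t) = e^{t(x^2+y^2)} I_0(2xyt)$ as recalled in the proof; the helper names `bessel_i`, `gen_fn`, and `d_x_gen_fn` are hypothetical, and the Bessel function is evaluated from its power series rather than a library routine.

```python
import math

def bessel_i(n, z, terms=80):
    """Modified Bessel function I_n(z), n a non-negative integer, via its power series."""
    return sum((z / 2) ** (2 * m + n) / (math.factorial(m) * math.factorial(m + n))
               for m in range(terms))

def gen_fn(t, x, y):
    """P(t) = exp(t (x^2 + y^2)) * I_0(2 x y t), the assumed generating function."""
    return math.exp(t * (x * x + y * y)) * bessel_i(0, 2 * x * y * t)

def d_x_gen_fn(t, x, y):
    """The x-derivative of P(t), using d/dz I_0(z) = I_1(z)."""
    e = math.exp(t * (x * x + y * y))
    z = 2 * x * y * t
    return 2 * x * t * e * bessel_i(0, z) + 2 * y * t * e * bessel_i(1, z)
```

A central finite difference of `gen_fn` in `x` should agree with `d_x_gen_fn` up to discretization error.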

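We can now compute the bivariate Laplace transform of $\mathscr{C}(s,t)$.

Lemma 3.2.

$$L_{s,t}[\mathscr{C}(s,t)](\eta,\omega) = \frac{1}{8}\int_{(1-2a)^2}^{1} \frac{n_1 - n_2\rho}{(p_1 - p_2\rho)^{3/2}\,(r_1 - r_2\rho)^{3/2}}\, d\rho,$$

where these parameters are given by

$$n_1 = -\omega\eta\left[(1 - \lambda_- - \lambda_+)^2\right] - (\omega + \eta)\lambda_-\lambda_+(1 - \lambda_- - \lambda_+) - (1 - \lambda_- - \lambda_+ + 2\lambda_-\lambda_+)\lambda_-\lambda_+$$
$$n_2 = -\omega\eta\left[1 - \lambda_- - \lambda_+ + 2\lambda_-\lambda_+\right] + (\omega + \eta - 1)\lambda_-\lambda_+$$
$$p_1 = \omega(1 - \lambda_- - \lambda_+) + \lambda_-\lambda_+$$
$$p_2 = \omega - \omega^2$$
$$r_1 = \eta(1 - \lambda_- - \lambda_+) + \lambda_-\lambda_+$$
$$r_2 = \eta - \eta^2.$$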

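Proof. We start by commuting the integration in $\sigma$ and the Laplace transform in (43). To evaluate these Laplace transforms, we recall Lemma 2.7, where the Laplace transform $L_t[\partial_x \mathscr{P}(t)]$ was computed to be

$$L_t[\partial_x \mathscr{P}(t)](\omega) = \frac{2x(\omega + y^2 - x^2)}{\left((\omega - x^2 - y^2)^2 - 4x^2y^2\right)^{3/2}}, \qquad \omega > (x + y)^2.$$

The quantity $x^2 - y^2$ simplifies to

$$x^2 - y^2 = \left((x^2 + y^2)^2 - 4x^2y^2\right)^{1/2} = \frac{b - a}{1 + 2\sigma}.$$

The Laplace transform of $\partial_x \mathscr{P}(t)$ can be rewritten as

$$L_t[\partial_x \mathscr{P}(t)](\omega) = \frac{2x(1 + 2\sigma)^2\left(\omega(1 + 2\sigma) - (b - a)\right)}{\left((\omega^2 - \omega)(1 + 2\sigma)^2 + (1 - 2a)(1 - 2b)\omega + (b - a)^2\right)^{3/2}}, \qquad \omega > \lambda_+,$$

for $\sigma \in [-a, 0]$. By symmetry, the Laplace transform of $\partial_y \mathscr{P}(t)$ is

$$L_t[\partial_y \mathscr{P}(t)](\omega) = \frac{2y(1 + 2\sigma)^2\left(\omega(1 + 2\sigma) + (b - a)\right)}{\left((\omega^2 - \omega)(1 + 2\sigma)^2 + (1 - 2a)(1 - 2b)\omega + (b - a)^2\right)^{3/2}}.$$

Define $A(\omega)$ to be

$$A(\omega) = (\omega^2 - \omega)(1 + 2\sigma)^2 + (1 - 2a)(1 - 2b)\omega + (b - a)^2,$$

and define $\rho = (1 + 2\sigma)^2$. We will now split the computation of $\mathscr{C}(s,t)$ into two pieces for simplicity's sake. The first piece is

$$L_{s,t}\left[\partial_x \mathscr{P}(s)\,\partial_x \mathscr{P}(t) + \partial_y \mathscr{P}(s)\,\partial_y \mathscr{P}(t)\right](\eta,\omega) = 4\rho^2\, \frac{(x^2 + y^2)\left[\omega\eta\rho + (b - a)^2\right] - (b - a)^2(\omega + \eta)}{A(\omega)^{3/2} A(\eta)^{3/2}}.$$

The second piece is

$$L_{s,t}\left[\partial_x \mathscr{P}(s)\,\partial_y \mathscr{P}(t) + \partial_y \mathscr{P}(s)\,\partial_x \mathscr{P}(t)\right](\eta,\omega) = \frac{8xy\rho^2\left[\omega\eta\rho - (b - a)^2\right]}{A(\omega)^{3/2} A(\eta)^{3/2}}.$$

Combining these two pieces,

$$L_{s,t}[\mathscr{C}](\eta,\omega) = \int_{-a}^{0} \frac{\rho^2\left[\left((x^2 + y^2)(1 - x^2 - y^2) - 4x^2y^2\right)\omega\eta\rho + \left((x^2 + y^2)(1 - x^2 - y^2) + 4x^2y^2\right)(b - a)^2\right]}{\sqrt{\rho}\, A(\omega)^{3/2} A(\eta)^{3/2}}\, d\sigma$$
$$\qquad + \int_{-a}^{0} \frac{\rho^2\left[-(b - a)^2(\omega + \eta)(1 - x^2 - y^2)\right]}{\sqrt{\rho}\, A(\omega)^{3/2} A(\eta)^{3/2}}\, d\sigma.$$

We simplify some of these expressions,

$$(x^2 + y^2)(1 - x^2 - y^2) - 4x^2y^2 = -\tfrac{1}{2}\rho^{-2}\left[(-4a(1 - a) + 1)(-4b(1 - b) + 1)\right] + \tfrac{1}{2}\rho^{-1}\left[1 - 2b(1 - b) - 2a(1 - a)\right],$$
$$(x^2 + y^2)(1 - x^2 - y^2) + 4x^2y^2 = \tfrac{1}{2}\rho^{-1}\left[\rho - 1 + 2b(1 - b) + 2a(1 - a)\right],$$
$$1 - x^2 - y^2 = \tfrac{1}{2}\rho^{-1}\left[\rho + (1 - 2a)(1 - 2b)\right].$$

After changing the integration to be over $\rho$, using $d\rho = 4(1 + 2\sigma)\, d\sigma$, we produce the desired formula. □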

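We will explicitly evaluate the integral in Lemma 3.2 to conclude that

Lemma 3.3.

$$L_{s,t}[\mathscr{C}(s,t)](\eta,\omega) = \frac{d^2}{\left(\sqrt{(\hat\omega + d)(\hat\eta - d)} + \sqrt{(\hat\omega - d)(\hat\eta + d)}\right)^2 \sqrt{\hat\omega^2 - d^2}\,\sqrt{\hat\eta^2 - d^2}},$$

where $d = \frac{\lambda_+ - \lambda_-}{2}$, $\hat\omega = \omega - \frac{\lambda_+ + \lambda_-}{2}$ and $\hat\eta = \eta - \frac{\lambda_+ + \lambda_-}{2}$.

By comparing with the expression derived in (42), this lemma completes the proof of the diagonalization of the covariances.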
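Proof. Differentiating both sides, it can be shown that

$$\int \frac{n_1 - n_2\rho}{(p_1 - p_2\rho)^{3/2}(r_1 - r_2\rho)^{3/2}}\, d\rho = -2\, \frac{2n_2p_1r_1 - n_1(p_1r_2 + p_2r_1) + \rho\left(2n_1p_2r_2 - n_2(p_1r_2 + p_2r_1)\right)}{(p_1r_2 - p_2r_1)^2\,(p_1 - p_2\rho)^{1/2}(r_1 - r_2\rho)^{1/2}}.$$

The indefinite integral can be greatly simplified, plugging in some of the $n$, $p$, and $r$ terms.

$$\int \frac{n_1 - n_2\rho}{(p_1 - p_2\rho)^{3/2}(r_1 - r_2\rho)^{3/2}}\, d\rho = \frac{2\left((1 - \lambda_- - \lambda_+)(\eta + \omega) + 2\lambda_-\lambda_+\right) - 2\rho(\omega + \eta - 2\omega\eta)}{(\eta - \omega)^2\,(p_1 - p_2\rho)^{1/2}(r_1 - r_2\rho)^{1/2}}.$$

The antiderivative will now be evaluated at both endpoints. At $\rho = 1$, it becomes

$$\frac{2\left(2\omega\eta - (\omega + \eta)(\lambda_- + \lambda_+) + 2\lambda_-\lambda_+\right)}{(\eta - \omega)^2 \sqrt{(\omega - \lambda_-)(\omega - \lambda_+)}\,\sqrt{(\eta - \lambda_-)(\eta - \lambda_+)}}.$$

To evaluate at $\rho = (1 - 2a)^2$, it is helpful to work with $a$ and $b$ instead of $\lambda_\pm$. Using the formulae

$$\lambda_-\lambda_+ = (b - a)^2 \quad\text{and}\quad \lambda_- + \lambda_+ = 2(a + b - 2ab),$$

the antiderivative evaluated at $\rho = (1 - 2a)^2$ is simply

$$\frac{4}{(\eta - \omega)^2}.$$

At last we can give a single expression for the Laplace transform of the covariance function:

$$L_{s,t}[\mathscr{C}(s,t)](\eta,\omega) = \frac{\left(\sqrt{(\omega - \lambda_-)(\eta - \lambda_+)} - \sqrt{(\omega - \lambda_+)(\eta - \lambda_-)}\right)^2}{4(\eta - \omega)^2 \sqrt{(\omega - \lambda_-)(\omega - \lambda_+)}\,\sqrt{(\eta - \lambda_-)(\eta - \lambda_+)}}.$$

Recall that $d = \frac{\lambda_+ - \lambda_-}{2}$, $\hat\omega = \omega - \frac{\lambda_+ + \lambda_-}{2}$ and $\hat\eta = \eta - \frac{\lambda_+ + \lambda_-}{2}$. We rewrite this expression in terms of these modified parameters to get

$$L_{s,t}[\mathscr{C}(s,t)](\eta,\omega) = \frac{d^2}{\left(\sqrt{(\hat\omega + d)(\hat\eta - d)} + \sqrt{(\hat\omega - d)(\hat\eta + d)}\right)^2 \sqrt{\hat\omega^2 - d^2}\,\sqrt{\hat\eta^2 - d^2}}.$$

□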
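The general antiderivative used at the start of this proof can be sanity-checked by numerical differentiation. This is a sketch under the assumption that the antiderivative of $(n_1 - n_2\rho)\,(p_1 - p_2\rho)^{-3/2}(r_1 - r_2\rho)^{-3/2}$ is $-2$ times a linear polynomial in $\rho$ divided by $(p_1r_2 - p_2r_1)^2\sqrt{(p_1 - p_2\rho)(r_1 - r_2\rho)}$; the function names and the sample parameter values in the usage are illustrative only.

```python
import math

def integrand(rho, n1, n2, p1, p2, r1, r2):
    """(n1 - n2*rho) / ((p1 - p2*rho)^(3/2) * (r1 - r2*rho)^(3/2))."""
    return (n1 - n2 * rho) / ((p1 - p2 * rho) ** 1.5 * (r1 - r2 * rho) ** 1.5)

def antiderivative(rho, n1, n2, p1, p2, r1, r2):
    """Candidate antiderivative of `integrand` with respect to rho."""
    s = p1 * r2 + p2 * r1
    num = 2 * n2 * p1 * r1 - n1 * s + rho * (2 * n1 * p2 * r2 - n2 * s)
    den = ((p1 * r2 - p2 * r1) ** 2
           * math.sqrt(p1 - p2 * rho) * math.sqrt(r1 - r2 * rho))
    return -2 * num / den

def central_difference(f, x, h=1e-5):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)
```

For example, with arbitrary test parameters `(n1, n2, p1, p2, r1, r2) = (1.3, 0.7, 5.0, 1.0, 7.0, 2.0)`, the finite-difference derivative of `antiderivative` at `rho = 0.5` should match `integrand` closely.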



4. Extension to Continuously Differentiable Test Functions 

We learned the idea for extending the CLT from the appendix of Anderson-Zeitouni [2]. Roughly speaking, one would like to extend a CLT for polynomial test functions to a CLT for a larger class of functions, the hope being to invoke the density of the polynomials. However, it must be ensured that a small error in approximation produces a small error in the fluctuations when evaluated on the empirical process. The property of a matrix ensemble that allows one to execute this is a type of global concentration of eigenvalues. See also Proposition 11.6 in [2] and Lemma 1 of [41] for related approaches.



Proposition 4.1. Let $\{A_n\}$ be an ensemble of matrices with compact spectral support $S$, and let $V : C^1(S) \to \mathbb{R}$ be a positive semidefinite quadratic form for which there is a constant $C_1$ so that $V(f) \le C_1 \|f\|_{Lip}^2$ for all $f \in C^1(S)$. Suppose that $\{A_n\}$ satisfies a polynomial-type CLT, i.e. for all polynomials $g$,

$$\mathrm{tr}\, g(A_n) - E\, \mathrm{tr}\, g(A_n) \Rightarrow N(0, V(g))$$

and additionally $\mathrm{Var}\, \mathrm{tr}\, g(A_n) \to V(g)$. If the ensemble satisfies a Poincaré type concentration inequality, i.e.

(44) $$\mathrm{Var}(\mathrm{tr}\, f(A_n)) \le C_2 \|f\|_{Lip}^2$$

for some constant $C_2$ independent of $n$ and any Lipschitz $f$ on $S$, then the polynomial CLT extends to all $C^1$ functions $f : S \to \mathbb{R}$, as

$$\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n) \Rightarrow N(0, V(f)).$$
Proof. We recall the quadratic Wasserstein metric

$$W_2(\mu, \nu)^2 = \inf E(X - Y)^2,$$

with the infimum over all couplings $(X, Y)$ with marginals $\mu$ and $\nu$ respectively. For a random variable $X$, we let $\mathcal{L}X$ denote its law. It is well known that $W_2(\mathcal{L}X_n, \mathcal{L}X) \to 0$ if and only if $X_n \Rightarrow X$ and $EX_n^2 \to EX^2$ (see Theorem 7.12 of [50]). For any $f \in C^1(S)$, let $Z_f$ denote a centered normal random variable with variance $V(f)$. Thus for any polynomial $g$, $W_2(\mathcal{L}(\mathrm{tr}\, g(A_n) - E\, \mathrm{tr}\, g(A_n)), \mathcal{L}Z_g) \to 0$.

Let $f$ be any $C^1(S)$ function. By Weierstrass approximation of the derivative of $f$, there is a sequence of polynomials $p_k$ so that $\|f - p_k\|_{Lip} \to 0$ as $k \to \infty$. It follows that $V(p_k) \to V(f)$ from its continuity with respect to the Lipschitz seminorm, and hence that $W_2(\mathcal{L}Z_{p_k}, \mathcal{L}Z_f) \to 0$ as $k \to \infty$. For any $k$ we can bound,

$$W_2(\mathcal{L}(\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n)), \mathcal{L}Z_f) \le W_2(\mathcal{L}(\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n)), \mathcal{L}(\mathrm{tr}\, p_k(A_n) - E\, \mathrm{tr}\, p_k(A_n)))$$
$$\qquad + W_2(\mathcal{L}(\mathrm{tr}\, p_k(A_n) - E\, \mathrm{tr}\, p_k(A_n)), \mathcal{L}Z_{p_k}) + W_2(\mathcal{L}Z_{p_k}, \mathcal{L}Z_f).$$

By the concentration inequality applied to $f - p_k$, it is possible to bound

$$E\left[\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n) - \left(\mathrm{tr}\, p_k(A_n) - E\, \mathrm{tr}\, p_k(A_n)\right)\right]^2 \le C_2 \|f - p_k\|_{Lip}^2,$$

from which it follows that $W_2(\mathcal{L}(\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n)), \mathcal{L}(\mathrm{tr}\, p_k(A_n) - E\, \mathrm{tr}\, p_k(A_n))) \le \sqrt{C_2}\, \|f - p_k\|_{Lip}$ by the definition of the Wasserstein metric as the infimum over couplings. Likewise

$$W_2(\mathcal{L}Z_{p_k}, \mathcal{L}Z_f) = \left|\sqrt{V(p_k)} - \sqrt{V(f)}\right| \le \sqrt{V(p_k - f)} \le \sqrt{C_1}\, \|f - p_k\|_{Lip}.$$

Therefore, from the polynomial CLT,

$$\limsup_{n \to \infty} W_2(\mathcal{L}(\mathrm{tr}\, f(A_n) - E\, \mathrm{tr}\, f(A_n)), \mathcal{L}Z_f) \le \left(\sqrt{C_1} + \sqrt{C_2}\right) \|f - p_k\|_{Lip}.$$

Taking $k \to \infty$ completes the proof. □
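The proof uses the fact that the quadratic Wasserstein distance between two centered normal laws with standard deviations $\sigma_1$ and $\sigma_2$ is $|\sigma_1 - \sigma_2|$. A minimal numerical sketch of this fact follows; it relies on the standard one-dimensional result that the quantile coupling is optimal, and the function names are illustrative rather than taken from the paper.

```python
import math
from statistics import NormalDist

def w2_centered_normals(s, t):
    """Closed form: W_2 between N(0, s^2) and N(0, t^2)."""
    return abs(s - t)

def w2_by_quantiles(s, t, cells=20000):
    """Approximate W_2 via the quantile coupling and a midpoint rule on (0, 1).

    In one dimension, W_2^2 equals the integral over u in (0, 1) of the
    squared difference of the two quantile functions.
    """
    z = NormalDist()  # standard normal; scaled quantiles are s*q and t*q
    total = 0.0
    for i in range(cells):
        q = z.inv_cdf((i + 0.5) / cells)
        total += (s * q - t * q) ** 2
    return math.sqrt(total / cells)
```

The quadrature slightly truncates the Gaussian tails, so agreement with the closed form is only approximate.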

Note that the moment-method proof used for the polynomial CLT implies $\mathrm{Var}\, \mathrm{tr}(g(A_n)) \to V(g)$, and that the bound $V(f) \le C\|f\|_{Lip}^2$ follows from Remark 1.4. To show that linear statistics of the Jacobi ensemble satisfy a Poincaré inequality, we will work directly with the joint eigenvalue density function. Recall (5), which stated



1 TT.ala- 1 ^- 1 ^ . ^1^-1 



i i<j 

We first show that the Jacobi ensemble satisfies a log-Sobolev inequality, which is strictly stronger than the Poincaré inequality. Define the entropy of a non-negative measurable function $f$ with respect to a probability measure $\mu$ by

$$\mathrm{Ent}_\mu(f) := \int f \log f \, d\mu - \left(\int f \, d\mu\right) \log \int f \, d\mu$$

if $\int f \log(1 + f)\, d\mu < \infty$ and $+\infty$ otherwise. Our tool in this direction is a consequence of the well-known Bakry-Emery condition, the content of which is contained in the following proposition (see Proposition 3.1
ofi). 

Proposition 4.2. Suppose that $d\mu = e^{-U} dx$ is supported on a convex set $\Omega$. If there is a $c > 0$ so that for all $x \in \mathrm{int}(\Omega)$, $\mathrm{Hess}\, U(x) \ge c\, \mathrm{Id}$, where $\mathrm{Id}$ is the identity matrix and $\ge$ is the partial ordering on positive semidefinite matrices, then for all smooth functions $f$ on $\mathbb{R}^n$,

$$\mathrm{Ent}_\mu(f^2) \le \frac{2}{c} \int |\nabla f|^2 \, d\mu.$$



To prove the log-Sobolev inequality with the appropriate constant, we need only check that the condition of Proposition 4.2 is satisfied. This we do in showing the following lemma.

Lemma 4.3. The Jacobi ensemble satisfies a log-Sobolev inequality

Ent MJ (/ 2 )<? f\Vf\ 2 dfij, 
lOT*ftc = 4^min{^-l,±=* - l) . 



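Proof. We will employ Proposition 4.2 and thus we begin by computing the Hessian of the logarithm of the density. Let $p := \frac{\beta n}{2}\left[\frac{b}{a} - 1\right] + \frac{\beta}{2} - 1$, and let $q := \frac{\beta n}{2}\left[\frac{1-b}{a} - 1\right] + \frac{\beta}{2} - 1$. The first derivative is given by

$$\frac{\partial}{\partial \lambda_i}\left(\log(d\mu_J)\right) = \frac{p}{\lambda_i} - \frac{q}{1 - \lambda_i} + \beta \sum_{j \ne i} \frac{1}{\lambda_i - \lambda_j}.$$

The second derivative is thus

$$\frac{\partial^2}{\partial \lambda_i^2}\left(\log(d\mu_J)\right) = -\frac{p}{\lambda_i^2} - \frac{q}{(1 - \lambda_i)^2} - \beta \sum_{j \ne i} \frac{1}{(\lambda_i - \lambda_j)^2}.$$

The mixed partials are just

$$\frac{\partial^2}{\partial \lambda_i \partial \lambda_j}\left(\log(d\mu_J)\right) = \frac{\beta}{(\lambda_i - \lambda_j)^2}.$$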

By the method of Gershgorin discs we conclude that the smallest eigenvalue of Hess(— logdfij) is at least 

P Q 



mm 

l<i<n 

0<A,<1 



A 2 (1-A,) 2 



An 
>4min{p,g}> — min{£- 1,^-1} 
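The Gershgorin step can be illustrated concretely. The sketch below assumes the density is proportional to $\prod_i \lambda_i^p (1-\lambda_i)^q \prod_{i<j} |\lambda_i - \lambda_j|^\beta$, so that each off-diagonal row sum of the Hessian of the negative log-density cancels exactly against the interaction part of the diagonal entry; the function names and sample values are hypothetical.

```python
def hessian_neg_log_density(lams, p, q, beta):
    """Hessian of -log density at distinct points lams in (0, 1).

    Assumed density: prod lam_i^p (1 - lam_i)^q * prod_{i<j} |lam_i - lam_j|^beta.
    """
    n = len(lams)
    hess = [[0.0] * n for _ in range(n)]
    for i, li in enumerate(lams):
        hess[i][i] = p / li ** 2 + q / (1 - li) ** 2
        for j, lj in enumerate(lams):
            if j != i:
                w = beta / (li - lj) ** 2
                hess[i][i] += w      # interaction contribution to the diagonal
                hess[i][j] = -w      # mixed partial of -log density
    return hess

def gershgorin_lower_bound(hess):
    """Smallest value of (diagonal entry - sum of |off-diagonal entries|) over rows."""
    n = len(hess)
    return min(hess[i][i] - sum(abs(hess[i][j]) for j in range(n) if j != i)
               for i in range(n))
```

Each Gershgorin disc then has left endpoint $p/\lambda_i^2 + q/(1-\lambda_i)^2$, which is at least $4\min\{p, q\}$ on $(0, 1)$.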
It is now a simple matter to show the needed concentration inequality and prove Theorem 1.3.

Proof of Theorem 1.3. From Proposition 2.9 and Proposition 4.1 it suffices to demonstrate a constant $C$ so that $\mathrm{Var}\, \mathrm{tr}\, f \le C\|f\|_{Lip}^2$, with the Lipschitz norm on $[0, 1]$, for all Lipschitz $f$. This in turn follows from the somewhat sharper inequality that

Vartr/ < C J \d Xt (f(X t ))\ 2 dnj(X u . . . , A„) = | / |V tr/| 2 d Mj (Ai, . . . , A n ), 

where in the last step we have used the symmetry of the linear statistic. It is a standard fact that the log-Sobolev inequality implies the Poincaré inequality with half the constant (see [22, Chapter 5]). Thus by Lemma 4.3 we have that for all smooth functions $f$,

Vartr/ < . }b " — b , f |V tr /| 2 d Mj (Ai, . . . , A„). 

4nmin{- -L — -1)7 

Extension to Lipschitz functions follows from the density of smooth functions in L 2 , and the proof is complete. 

□ 

5. Computing the Expectation 

In this section, we will prove Theorem 1.6. To establish the theorem for polynomial linear statistics $\phi$, a proof will be given that follows a similar track to the analogous statement proven for the Laguerre and Hermite ensembles in [17]. The key to this method of proof is establishing a certain palindromy. Recall that a polynomial $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ is palindromic in $z$ if $a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n$, or equivalently that $p(z) = z^n p(z^{-1})$.
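The two equivalent characterizations of palindromy can be checked mechanically. The following is a small sketch; the coefficient-list representation (lowest degree first) and the function names are choices made for this illustration.

```python
def is_palindromic(coeffs, tol=0.0):
    """coeffs[i] is the coefficient of z**i; True when a_i equals a_{n-i} for all i."""
    n = len(coeffs) - 1
    return all(abs(coeffs[i] - coeffs[n - i]) <= tol for i in range(n + 1))

def evaluate(coeffs, z):
    """Evaluate the polynomial with the given coefficients at z."""
    return sum(c * z ** i for i, c in enumerate(coeffs))
```

For a palindromic polynomial of degree $n$, `evaluate(coeffs, z)` agrees with `z**n * evaluate(coeffs, 1/z)`. In degree one this forces the form $c(1 + z)$, which with $z = -\alpha$ is a multiple of $1 - \alpha$.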

Theorem 5.1. The scaled moment $\frac{1}{n} E\, \mathrm{tr}(A^k)$ has a series expansion

$$\frac{1}{n} E\, \mathrm{tr}(A^k) = \sum_{j=0}^{\infty} \eta_k(j, \alpha)\, n^{-j}$$

whose coefficients $\eta_k(j, \alpha)$ are palindromic polynomials in $(-\alpha)$ of degree $j$.

While the proof of this palindromy works for all of these coefficients $\eta$ simultaneously, only the palindromy of $\eta_k(0, \alpha)$ and $\eta_k(1, \alpha)$ are required for Theorem 1.6. In particular, palindromy forces $\eta_k(0, \alpha)$ to have no $\alpha$ dependence, and it forces $\eta_k(1, \alpha)$ to be a multiple of $1 - \alpha$. As will be seen, this allows the $\alpha = 0$ case to be used to study the arbitrary $\alpha$ case. As the proof of Theorem 5.1 requires symmetric function theory, we delay the proof to the Appendix to allow a brief introduction to the relevant symmetric function theory.

Proof of Theorem 1.6 for polynomial $\phi$. Formally, let $m(x)$ be the moment generating function for the ensemble, and expand each moment asymptotically around $n = \infty$, i.e.

$$m(x) = \frac{1}{n} \sum_{k=0}^{\infty} \frac{E\, \mathrm{tr}(A^k)}{x^{k+1}} = \sum_{k=0}^{\infty} x^{-k-1} \sum_{j=0}^{\infty} \eta_k(j, \alpha)\, n^{-j};$$

then one has, to order $\frac{1}{n}$,

$$m(x) = \sum_{k=0}^{\infty} x^{-k-1} \left(\eta_k(0, \alpha) + \frac{\eta_k(1, \alpha)}{n}\right) + O(n^{-2}).$$

The $\alpha$-dependence of either of these terms is completely determined by Theorem 5.1, as $\eta_k(0, \alpha)$ can have no $\alpha$ dependence, and $\eta_k(1, \alpha)$ is a multiple of $(1 - \alpha)$. Define $m_0(x)$ and $m_1(x)$ so that

$$m(x)\big|_{\alpha = 0} = m_0(x) + \frac{1}{n} m_1(x) + O(n^{-2}).$$

In this notation, the palindromy shows that

$$m(x) = m_0(x) + (1 - \alpha)\frac{1}{n} m_1(x) + O(n^{-2}).$$

Further, the $\alpha = 0$ case, for fixed $n$, is relatively simple. As observed by Sutton [46], the Jacobi matrix model tends to a deterministic one as $\alpha \to 0$; precisely, it has eigenvalues that are the roots of $J_n^{r,s}$, the Jacobi polynomial of degree $n$ and parameters

$$r = n\left(\frac{b}{a} - 1\right), \qquad s = n\left(\frac{1-b}{a} - 1\right).$$

Suppose that the roots of $J_n^{r,s}$ are given by $\{\lambda_i\}_{i=1}^{n}$. Then for $\alpha = 0$, the moment generating function takes on the form

$$m(x) = \frac{1}{n} \sum_{k=0}^{\infty} \sum_{i=1}^{n} \frac{\lambda_i^k}{x^{k+1}} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{x - \lambda_i} = \frac{1}{n} \left(\log J_n^{r,s}(x)\right)' = \frac{1}{n} \frac{(J_n^{r,s})'(x)}{J_n^{r,s}(x)}.$$

Using the differential recurrence for Jacobi polynomials, it follows that $m(x)$ satisfies a formal power series equation

(45) $$m^2 + \frac{\frac{r+1}{n} - x\,\frac{r+s+2}{n}}{x(1 - x)}\, m + \frac{1 + \frac{r+s+1}{n}}{x(1 - x)} + \frac{m'}{n} = 0.$$

It follows that the constant-order term $m_0$ satisfies

$$a m_0^2 + \frac{b - a - (1 - 2a)x}{x(1 - x)}\, m_0 + \frac{1 - a}{x(1 - x)} = 0.$$

This leads to an explicit form for $m_0$,

$$m_0 = \frac{(a - b) + (1 - 2a)x - \sqrt{(b - a - (1 - 2a)x)^2 - 4a(1 - a)x(1 - x)}}{2ax(1 - x)} = \frac{(a - b) + (1 - 2a)x - \sqrt{(x - \lambda_-)(x - \lambda_+)}}{2ax(1 - x)},$$

where

$$\lambda_\pm = \left(\sqrt{b(1 - a)} \pm \sqrt{a(1 - b)}\right)^2.$$

Note that $\lambda_\pm$ are always real, and that they are always on $[0, 1]$. They are 0 and 1 exactly when $a = b$ and when $a = 1 - b$, respectively. Taking an inverse Stieltjes transform gives absolutely continuous part

$$\mu(x) = \frac{\sqrt{-(x - \lambda_-)(x - \lambda_+)}}{2\pi a x(1 - x)}\, 1_{(\lambda_-, \lambda_+)}(x).$$

This integrates to 1, as it can be shown that

$$\int_{\lambda_-}^{\lambda_+} \frac{\sqrt{-(x - \lambda_-)(x - \lambda_+)}}{x(1 - x)}\, dx = \pi\left(1 - \sqrt{\lambda_-\lambda_+} - \sqrt{(1 - \lambda_-)(1 - \lambda_+)}\right) = 2\pi a.$$

Note that this implies that the distribution has no discrete part.
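The edge formula and the normalization of the limiting density can be checked numerically. The sketch below assumes $\lambda_\pm = (\sqrt{b(1-a)} \pm \sqrt{a(1-b)})^2$ and the density $\sqrt{(x-\lambda_-)(\lambda_+-x)}/(2\pi a x(1-x))$ on $(\lambda_-, \lambda_+)$, with $a \le b$ and $a + b \le 1$; the function names and the midpoint-rule quadrature are choices made for this illustration.

```python
import math

def edges(a, b):
    """Endpoints (lambda_minus, lambda_plus) of the limiting spectral support."""
    lo = (math.sqrt(b * (1 - a)) - math.sqrt(a * (1 - b))) ** 2
    hi = (math.sqrt(b * (1 - a)) + math.sqrt(a * (1 - b))) ** 2
    return lo, hi

def density(x, a, b):
    """Assumed limiting density; zero outside (lambda_minus, lambda_plus)."""
    lo, hi = edges(a, b)
    if not lo < x < hi:
        return 0.0
    return math.sqrt((x - lo) * (hi - x)) / (2 * math.pi * a * x * (1 - x))

def total_mass(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of the density over its support."""
    lo, hi = edges(a, b)
    h = (hi - lo) / steps
    return h * sum(density(lo + (i + 0.5) * h, a, b) for i in range(steps))
```

With `a = 0.2` and `b = 0.5`, the endpoints come out as approximately `0.1` and `0.9`, and they satisfy $\lambda_-\lambda_+ = (b-a)^2$ and $\lambda_- + \lambda_+ = 2(a + b - 2ab)$.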

In the same fashion, one can also derive an explicit form for $m_1$. Pulling out the $\frac{1}{n}$ terms from (45), one is left with

$$2a m_0 m_1 + \frac{(b - a) - (1 - 2a)x}{x(1 - x)}\, m_1 + \frac{1 - 2x}{x(1 - x)}\, a m_0 + \frac{a}{x(1 - x)} + a m_0' = 0.$$

Solving for $m_1$,

$$m_1 = \frac{-x + \frac{1}{2}(\lambda_- + \lambda_+) + \sqrt{(x - \lambda_+)(x - \lambda_-)}}{2(x - \lambda_+)(x - \lambda_-)}.$$

To recover the density, one again applies the inverse Stieltjes transform. When $x$ is neither $\lambda_+$ nor $\lambda_-$, the limit $\lim_{\epsilon \to 0} m_1(x + i\epsilon)$ exists, and

limmi(x + ie) = ^1(a \+)( x )- 

e->o U ; 2ir^/-{x-\+){x-\-) (A -' A+n ' 

Computing the inverse Stieltjes transform at either of the poles, it is seen that there are point masses, so 
that the entire signed measure is 



v{x) = \6 x _{x) + \6 x+ (x) 



1 



2TTy/-{x- \+){x- A_) 



l(A_,A+)(^)- 
6. Numerics for the Extremal Case

In this section, we investigate the choice p = q = 1, which was not covered by Theorem 1.3. The method of proof breaks down in this extreme case, and so we have run a numerical simulation to help conjecture if the theorem extends.



[Figure 1, panel (a): QQ plot of sample trace data vs. standard normal quantiles (quantile plots for tr(A) experiments). Panel (b): empirical PDFs for sample trace data (PDF plots for tr(A) experiments).]

Figure 1. Experimental data for different values of $\beta$, with $n = 5000$ and 50000 samples of each. All experiments were run in Matlab R2010B, using the Edelman-Sutton matrix model.
In the alternate parameterization we have that $a = \frac{1}{2}$ and $b = \frac{1}{2}$. The density of the Jacobi ensemble becomes

(46) $$d\mu_J(\lambda_1, \ldots, \lambda_n) = \frac{1}{Z} \prod_{i} \lambda_i^{\frac{\beta}{2} - 1}(1 - \lambda_i)^{\frac{\beta}{2} - 1} \prod_{i<j} |\lambda_i - \lambda_j|^{\beta}.$$

Note that the constraining potential no longer carries any dependence on $n$. However, because the particles are forced to lie on $[0, 1]$ (physically speaking, they are trapped in an infinite potential well), it is likely that we have some limiting behavior. For polynomial test functions and $\beta = 2$, this case is covered by a theorem of Johansson (see Theorem 3.1 of [M]).

However, the method of proof used here breaks down in the case $a = \frac{1}{2}$, as it requires the entries of the sparse matrix model to have uniform variance estimates on the order of $n^{-1}$. When $a = \frac{1}{2}$ the matrix model entries are



Beta(§«, §i) and c- ~ i/Beta(f i, §(z + 1)) 



The variances of entries $c_i$ and $s_i$ are on the order of $\frac{1}{i}$, for which reason many of the arguments in later sections are no longer valid. To see how different the $a = b = \frac{1}{2}$ case is from the $a < \frac{1}{2}$ case, consider taking $f(x) = x$. It is easily seen that



3C 



^,a^V( c ?"|)(i-(cD 2 -(cU) 2 ), 



8 = 1 

with the convergence in $L^2$. Note that while a normal limit is expected if the summands are becoming infinitesimal (and this is what happens when $a < \frac{1}{2}$), the normal limit here must follow from something else; in particular, the staircase dependency structure of the variables cannot be ignored. We invite the reader to check that the variable is symmetric and to note how much cancellation occurs in computing the second and fourth moments (they are $1/(8\beta)$ and $3/(64\beta^2)$ respectively). Again, the fact that this variable is normally distributed follows from the mentioned theorem of Johansson.



Appendices 



A. Symmetric Functions 



To find the asymptotic distribution of the traces, we will appeal to Kadell's integral formula [27]. This formula makes use of Jack functions, and so we will provide a skeletal introduction to the relevant portions of symmetric function theory. A more expansive treatment is available in Macdonald's book [33], whose notation we will follow.

By a partition $\lambda$, we mean a non-increasing sequence of positive integers. The notation $\lambda \vdash n$, read '$\lambda$ partitions $n$,' means that the sum of the parts of $\lambda$ equals $n$. There is an important pictorial representation of a partition called a Young diagram. The diagram representation of a partition $(\lambda_1, \ldots, \lambda_n)$ is drawn by placing $\lambda_1$ boxes horizontally in a row, placing $\lambda_2$ boxes horizontally below that, continuing through $\lambda_n$ and left justifying each row. Having drawn a diagram representation, we can easily define the conjugate¹ partition $\lambda'$ to be that partition represented by reflecting the diagram across the vertical axis and rotating counterclockwise by a quarter turn.

Example A.1. The partition $\lambda = (5, 4, 1)$ is to the left, and its conjugate $\lambda' = (3, 2, 2, 2, 1)$ is to the right.

[Young diagrams of $\lambda$ and $\lambda'$.]

¹ This is also called the transpose.



Many formulas in symmetric function theory have sums or products computed from statistics of the diagram representation. For our purposes, we will need the arm length $a$, arm co-length $a'$, leg length $l$, and leg co-length $l'$ of a box $s$. The statistics $a(s)$ and $a'(s)$ are the number of boxes to the right and to the left of box $s$, respectively. Likewise, the statistics $l(s)$ and $l'(s)$ are the number of boxes below and above box $s$.

Example A.2. This is $\lambda = (6, 5, 5)$. For the marked box $s$ (in the third row and fourth column),

$$a(s) = 1, \qquad a'(s) = 3, \qquad l(s) = 0, \qquad l'(s) = 2.$$
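The diagram statistics above are easy to compute directly from the parts of a partition. The following is a minimal sketch; representing a partition as a tuple of parts and indexing boxes by zero-based (row, column) are conventions chosen for this illustration.

```python
def conjugate(partition):
    """Conjugate (transpose) partition: the column lengths of the Young diagram."""
    if not partition:
        return ()
    return tuple(sum(1 for part in partition if part > col)
                 for col in range(partition[0]))

def box_statistics(partition, row, col):
    """Arm, co-arm, leg, and co-leg lengths of the box at zero-based (row, col)."""
    conj = conjugate(partition)
    return {
        "arm": partition[row] - col - 1,   # boxes to the right
        "coarm": col,                      # boxes to the left
        "leg": conj[col] - row - 1,        # boxes below
        "coleg": row,                      # boxes above
    }
```

For the partition `(6, 5, 5)` and the box in the third row and fourth column, `box_statistics((6, 5, 5), 2, 3)` gives arm 1, co-arm 3, leg 0, and co-leg 2.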

The ring of symmetric functions $\Lambda$ consists of all those formal power series with complex coefficients² in the indeterminates $\{x_1, x_2, \ldots\}$ that are symmetric under permutation of the indices. In this application, the symmetric functions will be evaluated at some point $y = (y_1, y_2, \ldots, y_n) \in \mathbb{C}^n$, where it is understood that $f(y) = f(y_1, y_2, \ldots, y_n, 0, 0, \ldots)$. In this way, symmetric functions specialize to symmetric polynomials.

The symmetric functions of interest here are the power sums, as they describe traces. For an integer $k$, define $p_k$ by

$$p_k = x_1^k + x_2^k + x_3^k + \cdots,$$

and for a partition $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)$, define $p_\lambda$ by

$$p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots p_{\lambda_n}.$$

These are called the power sum symmetric functions, and $\{p_\lambda\}_\lambda$ are a basis for $\Lambda$. Note that the trace of a power of a matrix $\mathrm{tr}\, A^k$ can alternately be expressed as $p_k$ evaluated at the eigenvalues of $A$.

The second basis we require are the Jack symmetric functions $P_\lambda^\alpha$. For those interested, there is a concise introduction available in Stanley's paper [45]. By virtue of being a basis, it is possible to write $p_k$ as a finite linear combination of $\{P_\lambda^\alpha\}_{\lambda \vdash k}$.

There are multiple normalizations for the Jack functions in the literature. In citing some theorems, we will require a second normalization, $J_\lambda^\alpha$. The two are related, as $J_\lambda^\alpha = c(\lambda, \alpha) P_\lambda^\alpha$, where

(47) $$c(\lambda, \alpha) = \prod_{s \in \lambda} \left(\alpha a(s) + l(s) + 1\right),$$

using the arm length $a(s)$ and leg length $l(s)$.
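The normalizing constant is a product over the boxes of the diagram, so it can be evaluated in a few lines. The sketch below assumes the formula $c(\lambda, \alpha) = \prod_{s \in \lambda} (\alpha a(s) + l(s) + 1)$; the function name `jack_c` is hypothetical.

```python
def jack_c(partition, alpha):
    """Product over boxes s of (alpha * arm(s) + leg(s) + 1)."""
    # Column lengths of the diagram, used to read off leg lengths.
    cols = [sum(1 for part in partition if part > col)
            for col in range(partition[0])] if partition else []
    prod = 1
    for row, part in enumerate(partition):
        for col in range(part):
            arm = part - col - 1
            leg = cols[col] - row - 1
            prod *= alpha * arm + leg + 1
    return prod
```

At `alpha = 1` each factor is the hook length of the box, so `jack_c` reduces to the classical product of hook lengths.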

One final tool we will use is the Macdonald automorphism $\omega_\alpha$. It is defined in terms of the symmetric power functions by $\omega_\alpha p_k = \alpha p_k$; it is extended to each $p_\lambda$ as a multiplicative homomorphism; and at last it is extended to all $\Lambda$ as a $\mathbb{C}$-linear transformation. This automorphism acts on the Jack functions in a nice way as well, as by a formula of Stanley [45],

(48) w_ 1/a J^ _I = {-a) W J A Q - 



² More often in the literature on Jack functions, these coefficients are defined to be from $\mathbb{Q}(\alpha)$, but the distinction here is immaterial.

A.1. Kadell's Integral. Kadell's integral (see [27]) is a generalization of Selberg's integral @D], which states the following

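$$\int_{[0,1]^n} \prod_{i<j} |x_i - x_j|^{2/\alpha} \prod_{i=1}^{n} x_i^{r-1}(1 - x_i)^{s-1}\, dx = \prod_{i=1}^{n} \frac{\Gamma\left(r + \frac{n-i}{\alpha}\right)\Gamma\left(s + \frac{n-i}{\alpha}\right)\Gamma\left(1 + \frac{i}{\alpha}\right)}{\Gamma\left(r + s + \frac{2n-i-1}{\alpha}\right)\Gamma\left(1 + \frac{1}{\alpha}\right)}.$$

It was generalized to include the Jack function $P_\lambda^\alpha(x)$ in the integrand. Letting $W(n, \alpha, r, s)$ be the integrand of Selberg's integral, Kadell's integral is

(49) $$\int_{[0,1]^n} P_\lambda^\alpha(x)\, W(n, \alpha, r, s)\, dx = n!\, v_\lambda^\alpha \prod_{i=1}^{n} \frac{\Gamma\left(\lambda_i + r + \frac{n-i}{\alpha}\right)\Gamma\left(s + \frac{n-i}{\alpha}\right)}{\Gamma\left(\lambda_i + r + s + \frac{2n-i-1}{\alpha}\right)},$$

where the term $v_\lambda^\alpha$ is defined as

(50) $$v_\lambda^\alpha = \prod_{i<j} \frac{\Gamma\left(\lambda_i - \lambda_j + \frac{j-i+1}{\alpha}\right)}{\Gamma\left(\lambda_i - \lambda_j + \frac{j-i}{\alpha}\right)}.$$

Our goal is to show that

$$\int_{[0,1]^n} \frac{P_\lambda^\alpha(x)}{P_\lambda^\alpha(I_n)}\, W(n, \alpha, r, s)\, dx,$$

where $I_n = (1, 1, \ldots, 1)$ has $n$ 1's, has a quasi-palindromic property. The constant $P_\lambda^\alpha(I_n)$ is computable in terms of diagram statistics. From formula VI.10.20 of [33],

(51) $$P_\lambda^\alpha(I_n) = \prod_{s \in \lambda} \frac{n + \alpha a'(s) - l'(s)}{\alpha a(s) + l(s) + 1} = \frac{1}{c(\lambda, \alpha)} \prod_{s \in \lambda} \left(n + \alpha a'(s) - l'(s)\right),$$

where $c(\lambda, \alpha)$ is the constant that relates $J_\lambda^\alpha$ and $P_\lambda^\alpha$ (see (47)). To compare the two, we will convert Kadell's expression using $\Gamma$ functions into a Young diagram formula.

Recall that a quotient of $\Gamma$ functions, also known as the Pochhammer symbol $(x)_k$, may be expressed alternately as

$$\frac{\Gamma(x + k)}{\Gamma(x)} = (x)_k = (x)(x + 1) \cdots (x + k - 1),$$

when $k$ is a natural number. Define the generalized Pochhammer symbol $(t)_\mu$ (also known as the shifted factorial) to be

(52) $$(t)_\mu = \prod_{s \in \mu} \left(t + a'(s) - \frac{l'(s)}{\alpha}\right).$$

In terms of these expressions, (51) can be rewritten as

(53) $$P_\lambda^\alpha(I_n) = \frac{\left(\frac{n}{\alpha}\right)_\lambda \alpha^{|\lambda|}}{c(\lambda, \alpha)}.$$

We will need a closely related quantity to $c(\lambda, \alpha)$, so define $c'(\lambda, \alpha)$ to be

$$c'(\lambda, \alpha) = \prod_{s \in \lambda} \left(\alpha a(s) + l(s) + \alpha\right).$$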

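Both $c(\lambda, \alpha)$ and $c'(\lambda, \alpha)$ can be expressed as products of $\Gamma$ terms, which we will need to rewrite Kadell's integral. Write out the terms in $\alpha^{-|\lambda|} c'(\lambda, \alpha)$ by going from right to left along the first row of the diagram of $\lambda$. There are $\lambda_1 - \lambda_2$ terms that have $l(s) = 0$:

$$\left(\tfrac{0}{\alpha} + 1 + 0\right)\left(\tfrac{0}{\alpha} + 1 + 1\right) \cdots \left(\tfrac{0}{\alpha} + 1 + \lambda_1 - \lambda_2 - 1\right) = \frac{\Gamma(\lambda_1 - \lambda_2 + 1)}{\Gamma(1)}.$$

There are then $\lambda_2 - \lambda_3$ terms that have $l(s) = 1$:

$$\left(\tfrac{1}{\alpha} + 1 + \lambda_1 - \lambda_2\right)\left(\tfrac{1}{\alpha} + 1 + \lambda_1 - \lambda_2 + 1\right) \cdots \left(\tfrac{1}{\alpha} + 1 + \lambda_1 - \lambda_3 - 1\right) = \frac{\Gamma\left(\lambda_1 - \lambda_3 + 1 + \frac{1}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_2 + 1 + \frac{1}{\alpha}\right)}.$$

This pattern continues until at last there are $\lambda_n$ terms that have $l(s) = n - 1$:

$$\left(\tfrac{n-1}{\alpha} + 1 + \lambda_1 - \lambda_n\right)\left(\tfrac{n-1}{\alpha} + 1 + \lambda_1 - \lambda_n + 1\right) \cdots \left(\tfrac{n-1}{\alpha} + \lambda_1\right) = \frac{\Gamma\left(\lambda_1 + 1 + \frac{n-1}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_n + 1 + \frac{n-1}{\alpha}\right)}.$$

Writing out all the terms in the first row gives

$$\frac{\Gamma(\lambda_1 - \lambda_2 + 1)}{\Gamma(1)} \cdot \frac{\Gamma\left(\lambda_1 - \lambda_3 + 1 + \frac{1}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_2 + 1 + \frac{1}{\alpha}\right)} \cdot \frac{\Gamma\left(\lambda_1 - \lambda_4 + 1 + \frac{2}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_3 + 1 + \frac{2}{\alpha}\right)} \cdots \frac{\Gamma\left(\lambda_1 + 1 + \frac{n-1}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_n + 1 + \frac{n-1}{\alpha}\right)}.$$

Inducting over the rows, it follows that $c'(\lambda, \alpha)$ can be written as

(54) $$c'(\lambda, \alpha) = \alpha^{|\lambda|} \prod_{i<j} \frac{\Gamma\left(\lambda_i - \lambda_j + 1 + \frac{j-i-1}{\alpha}\right)}{\Gamma\left(\lambda_i - \lambda_j + 1 + \frac{j-i}{\alpha}\right)} \prod_{i=1}^{n} \Gamma\left(\lambda_i + 1 + \frac{n-i}{\alpha}\right).$$

If one does the same expansion along the first row for $c(\lambda, \alpha)$, one gets

$$\frac{\Gamma\left(\lambda_1 - \lambda_2 + \frac{1}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)} \cdot \frac{\Gamma\left(\lambda_1 - \lambda_3 + \frac{2}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_2 + \frac{2}{\alpha}\right)} \cdot \frac{\Gamma\left(\lambda_1 - \lambda_4 + \frac{3}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_3 + \frac{3}{\alpha}\right)} \cdots \frac{\Gamma\left(\lambda_1 + \frac{n}{\alpha}\right)}{\Gamma\left(\lambda_1 - \lambda_n + \frac{n}{\alpha}\right)}.$$

Repeating the analogous procedure for the rest of the rows, we eventually conclude

(55) $$c(\lambda, \alpha) = \alpha^{|\lambda|} \prod_{i<j} \frac{\Gamma\left(\lambda_i - \lambda_j + \frac{j-i}{\alpha}\right)}{\Gamma\left(\lambda_i - \lambda_j + \frac{j-i+1}{\alpha}\right)} \prod_{i=1}^{n} \frac{\Gamma\left(\lambda_i + \frac{n-i+1}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)}.$$

Equations (54) and (55) allow (50) to be rewritten as

(56) $$v_\lambda^\alpha = \frac{\alpha^{|\lambda|}}{c(\lambda, \alpha)} \prod_{i=1}^{n} \frac{\Gamma\left(\lambda_i + \frac{n-i+1}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)}.$$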

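We can repeat the same procedure as used for $c$ and $c'$ to show that $(t)_\lambda$ can be computed by

(57) $$(t)_\lambda = \prod_{i=1}^{n} \frac{\Gamma\left(t + \lambda_i - \frac{i-1}{\alpha}\right)}{\Gamma\left(t - \frac{i-1}{\alpha}\right)}.$$

This allows the expression in (56) for $v_\lambda^\alpha$ to be replaced by

(58) $$v_\lambda^\alpha = \frac{\alpha^{|\lambda|}}{c(\lambda, \alpha)} \prod_{i=1}^{n} \frac{\Gamma\left(\lambda_i + \frac{n-i+1}{\alpha}\right)\Gamma\left(\frac{n-i+1}{\alpha}\right)}{\Gamma\left(\frac{n-i+1}{\alpha}\right)\Gamma\left(\frac{1}{\alpha}\right)} = \frac{\alpha^{|\lambda|}}{c(\lambda, \alpha)} \left(\frac{n}{\alpha}\right)_\lambda \prod_{i=1}^{n} \frac{\Gamma\left(\frac{i}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)}.$$

Combine this expression for $v_\lambda^\alpha$ with Kadell's integral formula (49) and the simplified expression (53) for $P_\lambda^\alpha(I_n)$ to get

$$\int_{[0,1]^n} \frac{P_\lambda^\alpha(x)}{P_\lambda^\alpha(I_n)}\, W(n, \alpha, r, s)\, dx = \frac{c(\lambda, \alpha)}{\left(\frac{n}{\alpha}\right)_\lambda \alpha^{|\lambda|}} \cdot n!\, \frac{\alpha^{|\lambda|}}{c(\lambda, \alpha)} \left(\frac{n}{\alpha}\right)_\lambda \prod_{i=1}^{n} \frac{\Gamma\left(\frac{i}{\alpha}\right)}{\Gamma\left(\frac{1}{\alpha}\right)} \prod_{i=1}^{n} \frac{\Gamma\left(\lambda_i + r + \frac{n-i}{\alpha}\right)\Gamma\left(s + \frac{n-i}{\alpha}\right)}{\Gamma\left(\lambda_i + r + s + \frac{2n-i-1}{\alpha}\right)}$$
$$\qquad = \prod_{i=1}^{n} \frac{\Gamma\left(\frac{i}{\alpha} + 1\right)}{\Gamma\left(\frac{1}{\alpha} + 1\right)} \cdot \frac{\Gamma\left(\lambda_i + r + \frac{n-i}{\alpha}\right)\Gamma\left(s + \frac{n-i}{\alpha}\right)}{\Gamma\left(\lambda_i + r + s + \frac{2n-i-1}{\alpha}\right)}.$$

Let $\mu_J$ be the $\left(\frac{2}{\alpha}, r, s\right)$-Jacobi ensemble measure on $[0, 1]^n$. This has density function proportional to $W(n, \alpha, r, s)$, but it is appropriately renormalized to be a probability measure. This normalization is given by Selberg's integral.

The integral expression above can be rewritten as

(59) $$\int_{[0,1]^n} \frac{P_\lambda^\alpha(x)}{P_\lambda^\alpha(I_n)}\, d\mu_J(x) = \frac{\prod_{i=1}^{n} \frac{\Gamma\left(\frac{i}{\alpha} + 1\right)}{\Gamma\left(\frac{1}{\alpha} + 1\right)} \cdot \frac{\Gamma\left(\lambda_i + r + \frac{n-i}{\alpha}\right)\Gamma\left(s + \frac{n-i}{\alpha}\right)}{\Gamma\left(\lambda_i + r + s + \frac{2n-i-1}{\alpha}\right)}}{\int_{[0,1]^n} W(n, \alpha, r, s)\, dx} = \frac{\left(r + \frac{n-1}{\alpha}\right)_\lambda}{\left(r + s + \frac{2n-2}{\alpha}\right)_\lambda}.$$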
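The passage between diagram products and $\Gamma$ quotients can be tested numerically. The sketch below assumes the generalized Pochhammer symbol is $(t)_\lambda = \prod_{s \in \lambda} (t + a'(s) - l'(s)/\alpha)$ and compares it with the row-by-row quotient $\prod_i \Gamma(t + \lambda_i - (i-1)/\alpha) / \Gamma(t - (i-1)/\alpha)$; both function names are illustrative, and the comparison is only meaningful when every $\Gamma$ argument is positive.

```python
import math

def pochhammer_boxes(t, partition, alpha):
    """Product over boxes of (t + coarm - coleg / alpha), with zero-based row and column."""
    prod = 1.0
    for row, part in enumerate(partition):
        for col in range(part):
            prod *= t + col - row / alpha
    return prod

def pochhammer_gamma(t, partition, alpha):
    """Row-by-row Gamma quotient; `row` here plays the role of (i - 1)."""
    prod = 1.0
    for row, part in enumerate(partition):
        prod *= math.gamma(t + part - row / alpha) / math.gamma(t - row / alpha)
    return prod
```

Each row of the diagram contributes an ordinary Pochhammer symbol of length $\lambda_i$ started at $t - (i-1)/\alpha$, which is why the two products agree.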

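A.2. Palindromy.

Lemma A.3. Let

$$\int_{[0,1]^n} \frac{P_\lambda^\alpha(x)}{P_\lambda^\alpha(I_n)}\, d\mu_J(x) = \sum_{k=0}^{\infty} \rho(k, \lambda, \alpha)\, n^{-k}$$

be the series expansion about $n = \infty$. The coefficients $\rho(k, \lambda, \alpha)$ are skew-palindromic in that

$$\rho(k, \lambda, \alpha) = (-\alpha)^k \rho\left(k, \lambda', \tfrac{1}{\alpha}\right).$$

Proof. In the calculation that follows, let $f(\lambda, \alpha, t) = f(t) = \alpha a'(t) - l'(t)$, for tableau block $t \in \lambda$. Starting from the formula computed in (59), and applying formula (52) gives

$$\frac{\left(\frac{nb}{a\alpha}\right)_\lambda}{\left(\frac{n}{a\alpha}\right)_\lambda} = \prod_{t \in \lambda} \frac{nb + a f(t)}{n + a f(t)} = \prod_{t \in \lambda} \left(b + \frac{a}{n} f(t)\right) \sum_{k=0}^{\infty} \left(-\frac{a}{n} f(t)\right)^k = b^{|\lambda|} \prod_{t \in \lambda} \left[1 + \left(1 - \frac{1}{b}\right) \sum_{k=1}^{\infty} \left(-\frac{a}{n} f(t)\right)^k\right].$$

Let $M(\lambda, k)$ be the collection of all $k$-element multisets sampled from $\lambda$. If $\tau \in M(\lambda, k)$ is such a multiset, let $m_\tau(t)$ denote the multiplicity of $t \in \tau$ and let $e_\tau(t)$ be the characteristic function for $t \in \tau$. The sum can be written as:

$$b^{|\lambda|} \sum_{k=0}^{\infty} \left(\frac{a}{n}\right)^k \sum_{\tau \in M(\lambda, k)} \prod_{t \in \lambda} \left(-f(t)\right)^{m_\tau(t)} \left(1 - \frac{1}{b}\right)^{e_\tau(t)}.$$

This gives an explicit form for the coefficients $\rho(k, \lambda, \alpha)$. Mapping $\lambda$ to $\lambda'$ induces a bijection mapping the collection $M(\lambda, k)$ to $M(\lambda', k)$. In the conjugate, the arm co-length $a'$ and leg co-length $l'$ are reversed, so that $f(t)$ becomes $\alpha l'(t) - a'(t)$. Thus $f(\lambda, \alpha, t) = -\alpha f(\lambda', \alpha^{-1}, t)$, so that

$$\sum_{\tau \in M(\lambda, k)} \prod_{t \in \lambda} \left(-f(\lambda, \alpha, t)\right)^{m_\tau(t)} \left(1 - \frac{1}{b}\right)^{e_\tau(t)} = (-\alpha)^k \sum_{\tau \in M(\lambda', k)} \prod_{t \in \lambda'} \left(-f(\lambda', \alpha^{-1}, t)\right)^{m_\tau(t)} \left(1 - \frac{1}{b}\right)^{e_\tau(t)}.$$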
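Let $J_\lambda^\alpha$ be the Jack functions renormalized by

(60) $$J_\lambda^\alpha = c(\lambda, \alpha) P_\lambda^\alpha.$$

Expand the symmetric power function $p_k$ as

$$p_k = \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) J_\lambda^\alpha.$$

By applying Stanley's formula (see (48)), it follows (see [17]) that

(61) $$\epsilon(\lambda, \alpha) = (-\alpha)^{1 - |\lambda|} \epsilon(\lambda', \alpha^{-1}).$$

One last piece is needed. The normalization factor $J_\lambda^\alpha(I_n)$ can be computed by relating (53) and the definition of $J_\lambda^\alpha$ in (60). These two combined give that

$$J_\lambda^\alpha(I_n) = \left(\frac{n}{\alpha}\right)_\lambda \alpha^{|\lambda|} = \prod_{t \in \lambda} \left(n + \alpha a'(t) - l'(t)\right);$$

expand this as a polynomial in $n$, i.e. put

$$\prod_{t \in \lambda} \left(n + \alpha a'(t) - l'(t)\right) = \sum_{j=0}^{|\lambda|} \zeta(j, \lambda, \alpha)\, n^j.$$

Because the product can be expressed as

$$\prod_{t \in \lambda} \left(n + \alpha a'(t) - l'(t)\right) = \prod_{t \in \lambda'} \left(n + (-\alpha)\left(\alpha^{-1} a'(t) - l'(t)\right)\right),$$

it follows that

(62) $$\zeta(j, \lambda, \alpha) = (-\alpha)^{|\lambda| - j} \zeta(j, \lambda', \alpha^{-1}).$$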
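Proof of Theorem 5.1. Expand $p_k$ in the Jack function basis:

$$\frac{1}{n} E p_k = \frac{1}{n} \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) \int_{[0,1]^n} J_\lambda^\alpha(x)\, d\mu_J(x) = \frac{1}{n} \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha)\, J_\lambda^\alpha(I_n) \int_{[0,1]^n} \frac{J_\lambda^\alpha(x)}{J_\lambda^\alpha(I_n)}\, d\mu_J(x).$$

Apply Lemma A.3 and expand $J_\lambda^\alpha(I_n)$. Note that the alternative normalization used in the Lemma cancels out.

$$\frac{1}{n} E p_k = \frac{1}{n} \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) \left(\sum_{j=0}^{k} \zeta(j, \lambda, \alpha)\, n^j\right) \left(\sum_{j=0}^{\infty} \rho(j, \lambda, \alpha)\, n^{-j}\right) = \sum_{j=-\infty}^{k} n^{j-1} \left(\sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) \sum_{l=0}^{k} \zeta(l, \lambda, \alpha)\, \rho(l - j, \lambda, \alpha)\right)$$

with $\rho(l - j, \lambda, \alpha) = 0$ for negative $l - j$.

This gives a formula for $\eta_k(j, \alpha)$, namely that

$$\eta_k(j, \alpha) = \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) \sum_{l=0}^{k} \zeta(l, \lambda, \alpha)\, \rho(l + j - 1, \lambda, \alpha).$$

The $j < 0$ terms vanish, which can be seen because the trace can naturally be bounded as

$$\frac{1}{n} |E p_k| \le \frac{1}{n} E \sum_{i=1}^{n} |\lambda_i|^k \le \frac{1}{n}\, n = 1,$$

as the Jacobi distribution is supported on $[0, 1]^n$.

We will show that each $\eta_k(j, \alpha)$ is palindromic. Applying Lemma A.3, (61), and (62), these can be written as

$$\eta_k(j, \alpha) = \sum_{\lambda \vdash k} \epsilon(\lambda, \alpha) \sum_{l=0}^{k} \zeta(l, \lambda, \alpha)\, \rho(l + j - 1, \lambda, \alpha)$$
$$\qquad = \sum_{\lambda \vdash k} (-\alpha)^{1-k} \epsilon(\lambda', \alpha^{-1}) \sum_{l=0}^{k} (-\alpha)^{k-l} \zeta(l, \lambda', \alpha^{-1})\, (-\alpha)^{l+j-1} \rho(l + j - 1, \lambda', \alpha^{-1})$$
$$\qquad = (-\alpha)^{j} \sum_{\lambda \vdash k} \epsilon(\lambda', \alpha^{-1}) \sum_{l=0}^{k} \zeta(l, \lambda', \alpha^{-1})\, \rho(l + j - 1, \lambda', \alpha^{-1}).$$

The sum is over all partitions of $k$, so taking conjugates makes no difference. Thus,

$$\eta_k(j, \alpha) = (-\alpha)^{j} \eta_k(j, \alpha^{-1}).$$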

The last claim we make is that $\eta_k(j, \alpha)$ is a polynomial in $\alpha$ of degree $j$. This is more involved, and requires that we appeal to Edelman and Sutton's tridiagonal matrix model (see the start of Section 3). The moment $\frac{1}{n} E p_k = \frac{1}{n} E\, \mathrm{tr}(A^k)$ can be written in terms of a sum over alternating bridges (see Section 3.1),



lEtrA fc = I £ E(B ) 



w-\-i 
w£A 2k 

A priori, these expectations are moments of random variables distributed as the square root of a Beta 
random variable. However, by Lemma 12.61 the alternating bridge visits each matrix entry an even number 
of times. Thus, any term in the sum takes the form 

fe 

11 "2i-l ^2i-l U U2i * ^2i ' 

8=1 

where u>i ranges over the matrix entries referenced by the bridge w and ^ TOj = k. By independence, this 
expectation is a product of terms of the form 

Ecl m sl n and Ec'fTVf" 

UJ UJ UJ UJ 

By Lemma A.4 each such Beta moment admits a series expansion around $n = \infty$ and a $K$ so that

oo 

E ( B 0U+i = E n~ m a m *W*,m(n), 

■m=0 

where < Vt a+itm (n) < K m for all n. Moreover, this constant K can be chosen independently of w + i. 
Thus the entire trace admits such a series expansion, 



_. n oo 

^tr(A k ) = -J2 E E n- m a m n a+itm (n) 

i—1 w€A2k rn=0 
oo / 1 n \ 

= E«~ m « m kE E <VhW0 

m=0 \ i=l w£A2k / 



Because the cardinality of ^2fe is at most ( fe ), the sum f2 m (n) = ^ S™=fe+i S?bg-4 ^t»+i,mW satisfies 

•2fc~ 



an estimate < ft m (n) < ( 2 £)K m — CK m . Thus there are two expansions for the trace, valid for all n 



sufficiently large, i.e. 

(63) JT ri(j, a)n- j = -EtiiA k ) = >T Q,(n)a 



oo 

// 

3=0 3=0 

The left hand side expansion shows that the n — > oo limit must exist. Thus 

oo oo 

77(0, a) = lim N r](j,a)n~-' = lim N fL-(n)ar'n. _ ' 7 = lim fioC^)- 

n— >oo -*- — ' n— >oo ^ — ' n— >oo 

3=0 3=0 
In particular, $\eta(0, \alpha)$ has no $\alpha$ dependence. The proof now proceeds by induction. Suppose that for all $j < l$, the term $\eta(j, \alpha)$ is a polynomial in $\alpha$ of degree $j$. It should be shown that $\eta(l, \alpha)$ is a polynomial in $\alpha$ of degree $l$. The limit



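\[
\lim_{n\to\infty} n^{l}\left[\frac{1}{n}\mathbb{E}\operatorname{tr}(A^k) - \sum_{j=0}^{l-1}\eta(j,\alpha)\,n^{-j}\right]
= \lim_{n\to\infty}\sum_{j=0}^{\infty}\eta(j+l,\alpha)\,n^{-j} = \eta(l,\alpha)
\]

exists by virtue of the $\eta$ expansion, and by substituting the right hand side of (63), it follows that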



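\[
\eta(l,\alpha) = \lim_{n\to\infty} n^{l}\left[\sum_{j=0}^{\infty}\Omega_j(n)\,\alpha^{j}n^{-j} - \sum_{j=0}^{l-1}\eta(j,\alpha)\,n^{-j}\right]
= \lim_{n\to\infty}\left[\sum_{j=0}^{l-1}\big[\Omega_j(n)\,\alpha^{j} - \eta(j,\alpha)\big]n^{l-j} + \Omega_l(n)\,\alpha^{l}\right].
\]

By the inductive hypothesis, this limit can be written in the form

\[
\eta(l,\alpha) = \lim_{n\to\infty}\big[f_0(n) + f_1(n)\alpha + f_2(n)\alpha^{2} + \cdots + f_l(n)\alpha^{l}\big],
\]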

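and the limit exists for each fixed $\alpha$. Take $l + 1$ distinct values of $\alpha$. The convergence is uniform on this finite
set $\alpha_0, \ldots, \alpha_l$. Since the Vandermonde matrix of these values is invertible, each $f_i(n)$ is a fixed linear
combination of the $l+1$ convergent values, and so each $f_i(n)$ converges, where $0 \le i \le l$. Thus $\eta(l,\alpha)$ is a
polynomial of degree $l$ in $\alpha$, concluding the proof. □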
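The interpolation step at the end of this proof can be illustrated numerically. The following is only a sketch under stated assumptions: the function names and the toy sequences in `toy_coefficients` are hypothetical and not taken from the paper. It shows that the coefficients $f_0, \ldots, f_l$ are recovered exactly from the values of the polynomial at $l+1$ distinct nodes, which is why convergence of the values forces convergence of each coefficient.

```python
from fractions import Fraction


def solve_vandermonde(alphas, values):
    """Return [f_0, ..., f_l] with sum_i f_i * alpha**i == value at every node.

    Uses exact Gauss-Jordan elimination; the nodes must be distinct.
    """
    size = len(alphas)
    # Augmented Vandermonde system, one row per node.
    rows = [[Fraction(a) ** i for i in range(size)] + [Fraction(v)]
            for a, v in zip(alphas, values)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][size] for i in range(size)]


def toy_coefficients(n):
    """Hypothetical sequences f_0(n), f_1(n), f_2(n); illustration only."""
    return [1 + Fraction(1, n), -2 + Fraction(3, n * n), 5 - Fraction(1, n)]


def toy_values(n, alphas):
    """Values of f_0(n) + f_1(n) a + f_2(n) a^2 at each node a."""
    coeffs = toy_coefficients(n)
    return [sum(c * Fraction(a) ** i for i, c in enumerate(coeffs)) for a in alphas]


ALPHAS = [1, 2, 3]  # l + 1 = 3 distinct nodes for l = 2
recovered = solve_vandermonde(ALPHAS, toy_values(10, ALPHAS))
```

Because the map from node values to coefficients is a fixed linear map (the inverse Vandermonde matrix), it is continuous, and limits of the values pass to limits of the coefficients.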

Lemma A.4. Let $f_r(n)$ and $f_s(n)$ be positive real-valued functions defined on $\mathbb{N}$ so that

\[
0 \le f_r(n) \le C_2, \qquad 0 \le f_s(n) \le C_2, \qquad C_1 \le f_r(n) + f_s(n),
\]

where $C_i$ are some positive constants. Let $r = \alpha^{-1}f_r(n)\,n$, $s = \alpha^{-1}f_s(n)\,n$, and let $X \sim \mathrm{Beta}(r, s)$. There is
an asymptotic expansion

\[
\mathbb{E}\big[X^{k}(1-X)^{l}\big] = \sum_{m=0}^{\infty} n^{-m}\alpha^{m}p_m(n),
\]

and a constant $K$ depending only on $k$, $l$, $C_1$, and $C_2$ so that $0 \le p_m(n) \le K^m$.

Proof. The expectation, which can be computed using Euler's Beta integral formula, gives that

\[
\mathbb{E}\big[X^{k}(1-X)^{l}\big] = \frac{(r)_k\,(s)_l}{(r+s)_{k+l}}.
\]
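As a sanity check on the Pochhammer form of the Beta moment, the following minimal sketch compares it against the ratio of Beta functions $B(r+k, s+l)/B(r,s)$ computed through log-Gamma. The function names are hypothetical and chosen for illustration; only the Python standard library is assumed.

```python
from math import exp, lgamma


def pochhammer(x, k):
    """Rising factorial (x)_k = x (x + 1) ... (x + k - 1)."""
    result = 1.0
    for i in range(k):
        result *= x + i
    return result


def beta_moment_pochhammer(r, s, k, l):
    """E[X^k (1 - X)^l] for X ~ Beta(r, s), via Pochhammer symbols."""
    return pochhammer(r, k) * pochhammer(s, l) / pochhammer(r + s, k + l)


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_moment_integral(r, s, k, l):
    """The same moment written as the ratio B(r + k, s + l) / B(r, s)."""
    return exp(log_beta(r + k, s + l) - log_beta(r, s))
```

For example, `beta_moment_pochhammer(2.5, 4.0, 3, 2)` and `beta_moment_integral(2.5, 4.0, 3, 2)` should agree up to floating point error.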

Substituting in the definitions for $r$ and $s$ and writing out the Pochhammer symbols gives

\[
\prod_{i=0}^{k-1}\frac{\alpha^{-1}f_r n + i}{\alpha^{-1}(f_r + f_s)n + i}\;\prod_{i=0}^{l-1}\frac{\alpha^{-1}f_s n + i}{\alpha^{-1}(f_r + f_s)n + k + i}.
\]

All rational terms in this product produce similar asymptotic series expansions, and so we will only examine
one. Working with a term from the left hand product,

\[
\frac{\alpha^{-1}f_r n + i}{\alpha^{-1}(f_r + f_s)n + i} = \frac{\alpha^{-1}f_r n + i}{\alpha^{-1}(f_r + f_s)n}\left(1 + \frac{i\alpha}{(f_r + f_s)n}\right)^{-1}.
\]

Provided that $n$ is sufficiently large (depending on $C_1$ and $\alpha$), this can be expanded as a series.
a~~ 1 f r n + i a~ 1 f r n + i 



a- l {f r + fs)n + i cr x {f r + f s )n 



E (A + f>)~ 



fr 



fr + fs 



+ E (ttTTT+O (fr + f s y m n- m a 



^faW" m " r 



m— 1 
The coefficients $p_m(n)$ satisfy an estimate

\[
0 \le p_m(n) \le (C_1C_2 + k)(C_1)^{-m}.
\]

□

B. Poincaré Inequality for Beta

Lemma B.1. Let $Y \sim \mathrm{Beta}(p, q)$. For any Lipschitz function $f$ on $[0, 1]$,

\[
\operatorname{Var}f(Y) \le \frac{1}{4(p+q)}\,\mathbb{E}|f'(Y)|^{2}.
\]

We note that in the case that both p and q are greater than 1, the density is log-concave, and it is possible 
to use the general theory outlined by Bobkov in [8] to produce an equivalent bound, but we require the 
inequality to hold for all p and q positive, and thus we use an alternative technique. 

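Proof. We begin by showing the analogous bound for the translated random variable $X = 2Y - 1$, and write
$Y = T(X) := \tfrac{1}{2}(X + 1)$. The density of $X$ is given, up to exchanging the roles of $p$ and $q$ (which does not
affect the symmetric bound), by

\[
\frac{d\mu_{p,q}}{dx} = \frac{1}{Z_{p,q}}(1-x)^{p-1}(1+x)^{q-1}.
\]

We will show that for any Lipschitz function $f$ on $[-1, 1]$,

\[
\operatorname{Var}f(X) \le \frac{1}{p+q}\,\mathbb{E}\big[(1-X^{2})\,|f'(X)|^{2}\big]. \tag{64}
\]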

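As will be seen in the proof, this inequality is attained taking $f$ to be a multiple of the linear Jacobi
polynomial (for definitions, see [17]). The lemma follows from (64), as

\begin{align*}
\operatorname{Var}f(Y) &= \operatorname{Var}(f\circ T)(X) \\
&\le \frac{1}{p+q}\,\mathbb{E}\big[(1-X^{2})\,|(f\circ T)'(X)|^{2}\big] \\
&\le \frac{1}{p+q}\,\mathbb{E}|(f\circ T)'(X)|^{2} \\
&= \frac{1}{p+q}\cdot\frac{1}{2^{2}}\,\mathbb{E}|(f'\circ T)(X)|^{2} \\
&= \frac{1}{4(p+q)}\,\mathbb{E}|f'(Y)|^{2}.
\end{align*}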

The method of proof follows the general outline in the notes of Bakry [5]. Define the Jacobi differential
operator $L$ to be

\[
Lf = (1 - x^{2})f''(x) + \big(q - p - (p+q)x\big)f'(x),
\]

and define the carré du champ operator $\Gamma$ by

\[
\Gamma(f,g) = (1 - x^{2})f'(x)g'(x).
\]

It can be checked by integration by parts that for all $C^2$ functions on $[-1, 1]$ the Dirichlet form $\mathcal{E}(f,g)$
associated to $L$ satisfies

\[
\mathcal{E}(f,g) := -\int_{-1}^{1} f(x)(Lg)(x)\,d\mu_{p,q}(x) = \int_{-1}^{1}\Gamma(f,g)(x)\,d\mu_{p,q}(x).
\]

The spectrum of $L$ restricted to $L^2(\mu_{p,q})$ is non-positive, with eigenvalues $\gamma_n = -n(n + p + q - 1)$ for
non-negative integers $n$. Further, its eigenfunctions are given by the Jacobi polynomials $P_n^{(p-1,q-1)}(x)$, which when
normalized form a complete orthonormal system for $L^2(\mu_{p,q})$. From the density of the polynomials in $L^2(\mu_{p,q})$,
it is an immediate consequence that

\[
p + q = -\gamma_1 = \inf_{\substack{f\in L^2(\mu_{p,q}) \\ \mathbb{E}f(X)=0}}\frac{\mathcal{E}(f,f)}{\operatorname{Var}f(X)} = \inf_{\substack{f\in L^2(\mu_{p,q}) \\ \mathbb{E}f(X)=0}}\frac{\int\Gamma(f,f)(x)\,d\mu_{p,q}(x)}{\operatorname{Var}f(X)},
\]

which upon rewriting, gives (64). □
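The two inequalities of this appendix can be checked in exact arithmetic for linear test functions, where only the first two Beta moments are needed. The sketch below is an illustration under stated assumptions, not part of the proof: the function name is hypothetical, $Y \sim \mathrm{Beta}(p,q)$ with $X = 2Y - 1$, and the identity $1 - X^2 = 4Y(1-Y)$ is used. For $f(x) = x$ the bound (64) should hold with equality, and for $f(y) = y$ the bound of Lemma B.1 should hold.

```python
from fractions import Fraction


def linear_test_function_check(p, q):
    """Return (Var Y, Var X, E[1 - X^2] / (p + q)) for Y ~ Beta(p, q), X = 2Y - 1."""
    p, q = Fraction(p), Fraction(q)
    first_moment = p / (p + q)                               # E[Y]
    second_moment = p * (p + 1) / ((p + q) * (p + q + 1))    # E[Y^2]
    var_y = second_moment - first_moment ** 2
    var_x = 4 * var_y                                        # X is an affine image of Y
    # 1 - X^2 = 4 Y (1 - Y), so its mean is 4 (E[Y] - E[Y^2]).
    right_hand_side = 4 * (first_moment - second_moment) / (p + q)
    return var_y, var_x, right_hand_side


var_y, var_x, rhs = linear_test_function_check(Fraction(3, 2), 4)
```

Here `var_x == rhs` reflects that the linear function is the extremal case, and `var_y <= 1 / (4 (p + q))` is Lemma B.1 for the identity function.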
C. Coupling Bound for $\sqrt{\mathrm{Beta}}$

We provide an auxiliary lemma regarding the square root of Beta variables that appear in the matrix
entries. Note that because one of the parameters of the $c_i$ family is not $\Omega(n)$ for all $i$, this approximation
cannot be applied to every matrix entry with uniform error.



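Lemma C.1. If $Y$ is distributed as $\sqrt{\mathrm{Beta}(np, nq)}$, then

\[
\frac{2(p+q)\sqrt{n}}{\sqrt{q}}\left(Y - \sqrt{\frac{p}{p+q}}\right) \Rightarrow N(0,1)
\]

as $n \to \infty$, where $p, q$ are fixed positive constants. Moreover, it is possible to couple $Y$ to a standard normal
$X$ so that

\[
\operatorname{Var}\left[Y - \sqrt{\frac{p}{p+q}} - \frac{\sqrt{q}}{2(p+q)\sqrt{n}}\,X\right] \le \frac{K_{p,q}}{n^{2}}
\]

for some $K_{p,q} > 0$, independent of $n$, and continuous in $p, q$ positive, provided that $n > \max\{\tfrac{1}{p}, \tfrac{1}{q}\}$.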



Proof of Lemma C.1. Let $Y$ be distributed as $\sqrt{\mathrm{Beta}(np, nq)}$. Put

\[
\mu = \sqrt{\frac{p}{p+q}}, \qquad \sigma = \frac{1}{2(p+q)}\sqrt{\frac{q}{n}}.
\]

Note that these are not exactly the mean or standard deviation of $Y$; however,

\[
\tilde{Y} := \frac{Y - \mu}{\sigma} \Rightarrow N(0,1).
\]

Moreover, it will be shown that there is an $X$ distributed as $N(0, 1)$ so that

\[
\mathbb{E}(\tilde{Y} - X)^{2} \le \frac{K}{n}
\]

for some $K = K(p,q)$ depending continuously on $p, q$ positive. Note that this implies Lemma C.1 after
rescaling by $\sigma$.

The primary machinery here is Talagrand's transport inequality, which bounds the square L 2 -Wasserstein 

distance of Y and X, with X distributed as N(0, 1). We use a special case of Theorem 1.1 of [IS], which 

states 

Proposition C.2 (Talagrand). Let $Y$ be a random variable given by probability measure $\nu$, which is absolutely
continuous with respect to Lebesgue measure, and let $\gamma$ be a standard Gaussian measure. There is a standard normal
random variable $X$ so that

\[
\mathbb{E}(Y - X)^{2} \le 2\int\log\frac{d\nu}{d\gamma}\,d\nu.
\]



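The density of $Y$ can be computed to be

\[
2\,y^{2np-1}(1-y^{2})^{nq-1}\,\frac{\Gamma(n(p+q))}{\Gamma(np)\Gamma(nq)}
\]

for $y \in [0, 1]$. It follows that the density of the normalized variable $\tilde{Y} = (Y-\mu)/\sigma$ is given by

\[
\frac{d\nu}{dy} = 2\sigma\,(\mu + y\sigma)^{2np-1}\big(1 - (\mu + y\sigma)^{2}\big)^{nq-1}\,\frac{\Gamma(n(p+q))}{\Gamma(np)\Gamma(nq)},
\]

and thus the Radon-Nikodym derivative $\frac{d\nu}{d\gamma}(y)$ is a product of four terms

\[
\underbrace{(\mu + y\sigma)^{2np-1}}_{(i)}\;\underbrace{\big(1 - (\mu + y\sigma)^{2}\big)^{nq-1}}_{(ii)}\;\underbrace{e^{y^{2}/2}}_{(iii)}\;\underbrace{2\sigma\,\frac{\Gamma(n(p+q))}{\Gamma(np)\Gamma(nq)}\sqrt{2\pi}}_{(iv)}.
\]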

The logs of terms (i) and (ii) can be controlled by Taylor expansion. Explicitly,

\[
\ln[1+y]^{g} = g\ln(1+y) \le g\left[y - \frac{y^{2}}{2} + \frac{y^{3}}{3}\right]
\]

for all $y > -1$, and all $g > 0$. Note that both produce a nonzero constant term, by virtue of the relationship
$\ln(a + y) = \ln(a) + \ln(1 + y/a)$. This bound is applied to the logs of both (i) and (ii) after suitable
rearrangement. This bounds the sum of the logs by a polynomial in $y$ of degree 6. We can bound the log of term (i) by

\[
\ln\big[(\mu + y\sigma)^{2np-1}\big] = (2np-1)\ln\mu + (2np-1)\ln\left[1 + \frac{y\sigma}{\mu}\right]
\le (2np-1)\left[\ln\mu + \frac{y\sigma}{\mu} - \frac{1}{2}\left(\frac{y\sigma}{\mu}\right)^{2} + \frac{1}{3}\left(\frac{y\sigma}{\mu}\right)^{3}\right],
\]

where $\sigma/\mu = \sqrt{q}\,/\,\big(2\sqrt{pn(p+q)}\big)$.



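Applying the same to term (ii),

\[
\ln\big[(1 - (\mu + y\sigma)^{2})^{nq-1}\big] \le (nq-1)\left[\ln(1-\mu^{2}) - u - \frac{u^{2}}{2} - \frac{u^{3}}{3}\right],
\qquad u := \frac{2\mu\sigma y + \sigma^{2}y^{2}}{1-\mu^{2}},
\]

where $2\mu\sigma/(1-\mu^{2}) = \sqrt{p}\,/\sqrt{qn(p+q)}$ and $\sigma^{2}/(1-\mu^{2}) = 1/(4n(p+q))$.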

From this form, it is easy to see that the coefficients of this polynomial depend continuously on $p$ and $q$.
Further, the coefficients of $y^4$, $y^5$, and $y^6$ already decay at least as fast as $1/n$. The coefficient of $y^3$ decays
like $n^{-1/2}$, so some amount of control over $\mathbb{E}\tilde{Y}^3$ will need to be gained. The coefficients of the lower order
terms do not a priori decay at all, but there is strong cancellation. The constant term is

\[
C_0(p,q) := (2np-1)\ln(\mu) + (nq-1)\ln(1-\mu^{2}),
\]

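the linear term has coefficient

\[
C_1(p,q) := \frac{(2np-1)\,\sigma}{\mu} - \frac{2(nq-1)\,\mu\sigma}{1-\mu^{2}},
\]

and the quadratic term has coefficient

\[
C_2(p,q) := -\frac{1}{2}\left[\frac{(2np-1)\,\sigma^{2}}{\mu^{2}} + (nq-1)\left(\frac{2\sigma^{2}}{1-\mu^{2}} + \frac{4\mu^{2}\sigma^{2}}{(1-\mu^{2})^{2}}\right)\right].
\]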



The $-\frac{1}{2}$ in the quadratic term represents the asymptotically Gaussian portion, and it annihilates term (iii).
This leaves four sources of error that need to be controlled to show the desired $O(n^{-1})$ bound for the
normalized variable $\tilde{Y} = (Y-\mu)/\sigma$:

(1) $|\mathbb{E}\tilde{Y}| \le C(p,q)\,n^{-1/2}$ to control the linear term.

(2) $|\mathbb{E}\tilde{Y}^{3}| \le C(p,q)\,n^{-1/2}$ to control the cubic term.

(3) $|\mathbb{E}(\tilde{Y})^{k}| \le C(p,q)$ to control the second, fourth, fifth, and sixth terms.

(4) The constants from the Taylor approximation and the constants from part (iv) of the Radon-Nikodym
derivative need to cancel to order $O(n^{-1})$.
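The cancellation in the linear and quadratic Taylor coefficients can be checked numerically. The sketch below rests on assumptions that should be read as such: it takes $\mu = \sqrt{p/(p+q)}$ and $\sigma = \sqrt{q}/(2\sqrt{n}(p+q))$ (the normalization appearing in Lemma C.1), and it uses the coefficients obtained by applying the third-order bound $\ln(1+z) \le z - z^2/2 + z^3/3$ to terms (i) and (ii). The function name is hypothetical.

```python
from math import sqrt


def taylor_coefficients(n, p, q):
    """Linear and quadratic coefficients of the Taylor bound for ln[(i)] + ln[(ii)].

    Assumes mu = sqrt(p / (p + q)) and sigma = sqrt(q) / (2 sqrt(n) (p + q)).
    """
    mu = sqrt(p / (p + q))
    sigma = sqrt(q) / (2.0 * sqrt(n) * (p + q))
    one_minus_mu2 = 1.0 - mu * mu
    # Linear coefficient: the O(sqrt(n)) contributions of (i) and (ii) cancel.
    c1 = (2 * n * p - 1) * sigma / mu - (n * q - 1) * 2.0 * mu * sigma / one_minus_mu2
    # Quadratic coefficient: the O(1) part should equal -1/2.
    c2 = (-0.5 * (2 * n * p - 1) * sigma ** 2 / mu ** 2
          - (n * q - 1) * (sigma ** 2 / one_minus_mu2
                           + 2.0 * mu ** 2 * sigma ** 2 / one_minus_mu2 ** 2))
    return c1, c2


c1_small, c2_small = taylor_coefficients(10 ** 4, 1.5, 2.5)
c1_large, c2_large = taylor_coefficients(10 ** 6, 1.5, 2.5)
```

Under these assumptions the linear coefficient scales like $n^{-1/2}$, so `c1 * sqrt(n)` is essentially independent of `n`, and the quadratic coefficient approaches $-1/2$ with an $O(1/n)$ correction.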

The raw moments of $Y$ are easily computable, and their formula follows immediately from Euler's Beta
integral,

\[
\mathbb{E}(Y)^{k} = \frac{\Gamma(n(p+q))\,\Gamma(np + \tfrac{k}{2})}{\Gamma(n(p+q) + \tfrac{k}{2})\,\Gamma(np)}.
\]
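The raw moment formula is straightforward to evaluate through log-Gamma, which avoids overflow for large $n$. The following is a minimal sketch with a hypothetical function name; it only assumes $Y = \sqrt{B}$ with $B \sim \mathrm{Beta}(np, nq)$.

```python
from math import exp, lgamma, sqrt


def sqrt_beta_raw_moment(n, p, q, k):
    """E[Y^k] for Y = sqrt(B), B ~ Beta(n p, n q), via Euler's Beta integral."""
    total = n * (p + q)
    log_moment = (lgamma(total) + lgamma(n * p + k / 2.0)
                  - lgamma(total + k / 2.0) - lgamma(n * p))
    return exp(log_moment)


# Even moments reduce to ordinary Beta moments: E[Y^2] = p / (p + q).
second_moment = sqrt_beta_raw_moment(50, 1.0, 2.0, 2)
# For large n the first moment is close to sqrt(p / (p + q)).
first_moment = sqrt_beta_raw_moment(10000, 1.0, 2.0, 1)
```

The first moment lies slightly below $\sqrt{p/(p+q)}$ by Jensen's inequality, which is one way to see that $\mu$ is not exactly the mean of $Y$.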



Appropriate control over the first 6 raw moments could be achieved by taking sufficiently many terms from
the Stirling approximation and canceling terms. To some extent such a procedure is necessary, as it is
needed to get precise control over the first and third raw moments. However, we will not need to
do this for all 6 moments, because we can appeal to a Poincaré inequality. Provided that $n > \max\{\tfrac{1}{p}, \tfrac{1}{q}\}$,
the density $\frac{d\nu}{dy}$ is log-concave. Thus if it can be shown that $\tilde{Y}$ has constant order variance, we can use the
Poincaré inequality to bound higher moments by lower moments, i.e.

\[
\operatorname{Var}f(\tilde{Y}) \le C\,\mathbb{E}|f'(\tilde{Y})|^{2},
\]



applied to $f(\tilde{Y}) = (\tilde{Y})^{k}$, gives

\[
\mathbb{E}\tilde{Y}^{2k} \le \big[\mathbb{E}\tilde{Y}^{k}\big]^{2} + Ck^{2}\,\mathbb{E}\tilde{Y}^{2k-2}.
\]

Because of the log-concavity, $C$ can be taken to be $12\,\mathbb{E}|\tilde{Y}|^{2}$ (see Corollary 4.3 of [5]), which is continuous in $p$ and
$q$. Thus provided that $\mathbb{E}|\tilde{Y}|$ can be bounded by some continuous function in $p$ and $q$, iterating the Poincaré
inequality gives constant order bounds that are continuous in $p$ and $q$ for all absolute moments. Further,

\[
\mathbb{E}|\tilde{Y}| \le \sqrt{\mathbb{E}\tilde{Y}^{2}},
\]

so the problem has been reduced to finding good bounds for the first three raw moments of $\tilde{Y}$.

By appealing to Stirling's formula, and using that the error-in-approximation is bounded by the first 
omitted term in the asymptotic expansion, the first three moments of Y can be bounded by 



E Y 



< 



WpIp + v) 



e(y) 2 


e(Y) 3 



<1, 



< 



4 \/qp{p + q) 



It only remains to control the constant terms. The log of (iv) can be approximated by Stirling's formula: 



In 



T(n{p + q) r— 
2<r— - — — — — -v 2vr 



T(np)T(nq) 



—np In fi — nq ln(l — /j, ) + In ■ 



gyp 

(p + q)- 



< 



l l 



i 



12 n y/p^/qy/p + q 



Comparing this with the constants produced by the Taylor approximation on terms (i) and (ii), it is seen
that only the $O(n^{-1})$ term remains.
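The cancellation of the constant terms can also be observed numerically. The sketch below is an illustration under explicit assumptions: it takes $\mu = \sqrt{p/(p+q)}$, $\sigma = \sqrt{q}/(2\sqrt{n}(p+q))$, term (iv) equal to $2\sigma\sqrt{2\pi}\,\Gamma(n(p+q))/(\Gamma(np)\Gamma(nq))$, and the Taylor constant $(2np-1)\ln\mu + (nq-1)\ln(1-\mu^2)$. The function name is hypothetical. The comparison value used in the check is the first Stirling correction, $\frac{1}{12n}\big(\frac{1}{p+q} - \frac{1}{p} - \frac{1}{q}\big)$, which is derived here rather than quoted from the text.

```python
from math import lgamma, log, pi, sqrt


def constant_term_remainder(n, p, q):
    """ln[(iv)] plus the Taylor constant from terms (i) and (ii).

    Assumes mu = sqrt(p / (p + q)) and sigma = sqrt(q) / (2 sqrt(n) (p + q)).
    """
    mu = sqrt(p / (p + q))
    sigma = sqrt(q) / (2.0 * sqrt(n) * (p + q))
    log_term_iv = (log(2.0 * sigma * sqrt(2.0 * pi))
                   + lgamma(n * (p + q)) - lgamma(n * p) - lgamma(n * q))
    taylor_constant = (2 * n * p - 1) * log(mu) + (n * q - 1) * log(1.0 - mu * mu)
    return log_term_iv + taylor_constant


n, p, q = 1000, 1.0, 2.0
remainder = constant_term_remainder(n, p, q)
# Leading Stirling correction (derived in this sketch, not quoted from the paper).
predicted = (1.0 / (p + q) - 1.0 / p - 1.0 / q) / (12.0 * n)
```

With these choices the individual constants are of order $n$, while their sum is of order $1/n$, in line with the statement above.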